Cyclosporin synthetase.

(57) The nucleotide sequence which codes for cyclosporin synthetase and similar enzymes and recombinant vectors containing the sequence. The vectors are used in methods for the production of cyclosporin

This invention relates to nucleotide sequences which code for enzymes possessing cyclosporin synthetase-like activity and to methods for the production of cyclosporin and cyclosporin derivatives using these sequences.

The fungus Tolypocladium niveum (previously known as Tolypocladium inflatum GAMS) produces cyclosporins, a group of neutral cyclic peptides composed of eleven amino acids. Other fungi have been found which may form cyclosporins (Dreyfuss, 1986; Nakajima et al., 1989) but Tolypocladium niveum is the most important organism for the production of cyclosporins by fermentation. Cyclosporins exhibit remarkable biological effects: for example cyclosporin A, the main metabolite, is a potent immunosuppressant (Borel et al., 1976). An enzyme has been identified which catalyses the entire peptide biosynthesis of cyclosporin and is therefore called cyclosporin synthetase (Zocher et al., 1986, Billich and Zocher 1987). The biosynthesis proceeds non-ribosomally by a thiotemplate process, as has also been described for other peptide synthetases (Kleinkauf and von Döhren 1990). Each amino acid is first activated in the form of an adenylate, then bound in the form of a thioester and linked with the following amino acid to the peptide. In the case of cyclosporin A, seven of the amino acids, bound as thioesters, are methylated before they are linked to the preceding amino acid in a peptide bond. This methylation function is an integral constituent of the enzyme polypeptide (Lawen and Zocher 1990). Including the cyclisation reaction, cyclosporin synthetase performs at least 40 reactions.
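The figure of at least 40 reactions can be reproduced with a simple tally. The sketch below is an illustration only: it assumes (the breakdown is not spelled out in the text) that each of the eleven residues is adenylated once and bound as a thioester once, that seven residues are methylated, that ten peptide bonds form the linear undecapeptide, and that a single cyclisation closes the ring.

```python
# Tally of partial reactions for one cyclosporin A molecule.
# The per-step breakdown is an assumption made for illustration.
RESIDUES = 11        # amino acids in the cyclic peptide (from the text)
N_METHYLATED = 7     # methylated residues in cyclosporin A (from the text)


def count_reactions(residues: int = RESIDUES, methylated: int = N_METHYLATED) -> int:
    adenylations = residues          # activation as adenylate
    thioesterifications = residues   # binding as thioester
    methylations = methylated        # methylation of thioester-bound residues
    peptide_bonds = residues - 1     # bonds of the linear peptide
    cyclisation = 1                  # ring closure
    return (adenylations + thioesterifications + methylations
            + peptide_bonds + cyclisation)


print(count_reactions())  # 40
```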

Cyclosporin A contains three non-proteinogenic amino acids: D-alanine in position 8, α-aminobutyric acid in position 2 and, in position 1, the unusual amino acid (4R)-4-[(E)-2-butenyl]-4-methyl-L-threonine (Bmt or C9 amino acid). All three amino acids must each be prepared by a biosynthetic pathway which is independent of the primary biosynthetic pathway. Cyclosporin synthetase does not possess an alanine-racemase function (Kleinkauf and von Döhren 1990) and thus, D-alanine cannot be produced by cyclosporin synthetase by epimerisation of enzyme-bound L-alanine, as is the case for other peptide antibiotics whose biosynthesis mechanism is known.

Although attempts have been made to isolate and characterize cyclosporin synthetase in terms of its amino acid sequence, because of the complexity and size of the enzyme this has not to date been possible. Hence it has not been possible to characterize the DNA coding for cyclosporin synthetase.

This invention provides a nucleotide sequence which codes for an enzyme possessing cyclosporin synthetase-like activity. In the present specification, an enzyme possessing cyclosporin synthetase-like activity is an enzyme which catalyses the peptide biosynthesis of cyclosporins and structurally related peptides and derivatives.

Preferably, the nucleotide sequence codes for cyclosporin synthetase or an enzyme which is at least 70% (for example, at least 80, 90 or 95%) homologous to it and which possesses cyclosporin synthetase-like activity.
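The homology thresholds above can be made concrete with a small calculation. The sketch below is illustrative only: the specification does not state how homology is measured, so the code assumes simple percent identity over two pre-aligned sequences of equal length, with "-" marking a gap. The function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identical positions between two pre-aligned sequences.

    Assumption: positions where either sequence has a gap ('-') are
    ignored. Other conventions exist and give different numbers.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# Toy example (hypothetical sequences): 9 of 10 positions agree.
identity = percent_identity("ACDEFGHIKL", "ACDEFGHIKV")
print(identity >= 70.0)  # True
```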

Preferably, the nucleotide sequence codes for an enzyme which possesses cyclosporin synthetase-like activity and in which at least one amino acid recognition unit is different from that of cyclosporin synthetase.

Preferably, the nucleotide sequence comprises the sequence represented in Seq Id 1 or a sequence which hybridises to it under conditions of reduced stringency or, more preferably, stringent conditions. Stringent conditions include hybridisation at 42°C in 6 x SSPE, 50% formamide, 5 x Denhardt's solution, and 0.1% SDS and washing three times for 10 minutes in 2 x SSC, 0.1% SDS and twice for 30 minutes in 0.2 x SSC, 0.1% SDS at 65°C. Reduced stringency conditions include a washing temperature of 60°C. Even more preferably the nucleotide sequence codes for an enzyme having the amino acid sequence set out in Seq Id 2. The nucleotide sequence may have a restriction map as represented in figure 1.

In another aspect, the invention provides a recombinant vector containing a nucleotide sequence as defined above. The vector may include the endogenous promoter for cyclosporin synthetase or may include some other suitable promoter. A suitable promoter region is illustrated in Seq Id 7. The recombinant vector may be in the form of a plasmid, a cosmid, a P1 vector or a YAC vector. The invention also extends to host cells carrying the vector. Preferably the host cell is a Tolypocladium niveum cell.

The invention also provides a process for the production of cyclosporin or a cyclosporin derivative, comprising cultivating a host cell as defined above and causing the host cell to produce the cyclosporin or cyclosporin derivative.

The invention also provides a method for the production of a cyclosporin derivative, comprising altering the DNA sequence coding for cyclosporin synthetase so that the enzyme causes the production of the cyclosporin derivative, placing the altered DNA sequence in a vector, transforming a host cell with the vector, and causing the host cell to produce the cyclosporin derivative. Preferably the DNA sequence coding for cyclosporin synthetase is altered by changing the fragments that code for amino acid recognition units. Alterations may be made using standard techniques such as those based on PCR procedures. Point deletions, mutations and insertions, as well as larger alterations, are possible.

This specification describes the isolation and characterisation of the gene for cyclosporin synthetase from Tolypocladium niveum and the use of the gene in genetically engineering cells, including Tolypocladium niveum cells. While a protocol for the isolation of cyclosporin synthetase from Tolypocladium niveum was published in 1990 (Lawen and Zocher 1990), it is however not suitable for extracting large quantities of homogeneous enzyme in a short period of time. Also, in the publication, the synthetase was attributed an Mr of approximately 650,000 Dalton. It may, however, justifiably be assumed from sedimentation analyses with fluorescence-labelled protein (Lawen et al., 1992) and by extrapolation from the protein size of comparable enzymes that cyclosporin synthetase has an Mr of approximately 1,500 kDa. The enzyme occurs as a single polypeptide chain and cannot be decomposed into subunits by either denaturing or reducing agents (Lawen and Zocher 1990).

The enormous size of the enzyme means that a strategy for amino acid sequencing which differs from the customarily used route must be used. Substantially more homogeneous material is required than is generally used to perform fragmentation tests. It is for this reason that a protocol was developed for cyclosporin synthetase which may, in principle, also be applied to analogous enzymes from other microorganisms and, in the practical example of the purification of the enzyme from Tolypocladium niveum (Example 1), gave rise to a substantial improvement in terms of yield and the amount of time required.

Purification may initially proceed according to customary processes. Cell disruption may be performed, for example, with a high pressure homogeniser or a glass bead mill, the cells being present in moist or lyophilised state. If the cells are moist, pressure disruption is conveniently performed, for example with a Manton-Gaulin apparatus. Lyophilised cells are conveniently broken up by grinding in a mortar under liquid nitrogen. The crude extract so obtained is clarified by centrifugation. The nucleic acids are removed by precipitating them from the extract using customary reagents for this purpose; polyethylene imine or protamine sulphate are, for example, used. The nucleic acid precipitation also removes fine suspended particles, which can disturb subsequent purification stages. Then the proteins may be precipitated out of the clarified crude extract to provide the enzyme in a more concentrated form. The protein precipitation is customarily performed with ammonium sulphate. For cyclosporin synthetase, saturation to 50% is sufficient to achieve almost complete precipitation. After this step, the enzyme is in an enriched and highly concentrated state.

In principle, all chromatographic methods are suitable for further purification of the enzyme, such as ion-exchange chromatography and gel permeation chromatography. With very large proteins, gel permeation chromatography is particularly suitable as a very selective purification step. If the correct molecular sieve is chosen, an approximately 90% homogeneous protein preparation may be obtained in a single step. Analysis of purity is performed in SDS polyacrylamide gels (preferably gradient gels 4-15%).

The purification process used produces stable, at least 90% homogeneous, active enzyme preparations, as is necessary for characterisation of enzyme kinetics or protein chemistry. In Example 1, the protocol described in detail for Tolypocladium niveum, in comparison with the published method, reduces the time required from 4 days to 10 hours and increases the yield by approximately a factor of 4.

With a protein of this exceptional size, the requirement for amino acid sequences to identify the gene or gene product correctly is naturally greater than for an average-sized protein. Apart from the possibility of N-terminal blocking, it is also not possible to prepare a protein of this size in such a way that it is suitable for N-terminal sequencing. For these reasons, it is necessary to obtain a sufficient number of internal amino acid sequences.

However, when a protein of this size is fragmented, so many fragments are produced (theoretically approximately 700, assuming one cleavage every 20 amino acids) that the standard method of completely fragmenting the protein and purifying the fragments by high-pressure reversed-phase chromatography (HP-RPC) is not practicable. For this reason, fragmentation is performed under conditions which are sub-optimal for the relevant endoproteinases to give substantially larger fragments.
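The estimate of roughly 700 fragments follows from the size of the enzyme. The sketch below redoes the arithmetic under one stated assumption that is not in the text: an average residue mass of about 110 Da, a common rule of thumb for proteins.

```python
# Assumption (not from the text): average mass of one amino acid residue.
AVERAGE_RESIDUE_MASS_DA = 110.0


def estimated_residues(mr_da: float) -> float:
    """Approximate chain length of a protein of the given molecular mass."""
    return mr_da / AVERAGE_RESIDUE_MASS_DA


def estimated_fragments(mr_da: float, residues_per_cleavage: int = 20) -> float:
    """Approximate fragment count for one cleavage every N residues."""
    return estimated_residues(mr_da) / residues_per_cleavage


mr = 1_500_000  # approximately 1,500 kDa, as stated in the text
print(round(estimated_residues(mr)))   # chain length, on the order of 13,600
print(round(estimated_fragments(mr)))  # fragment count, on the order of 700
```

With these inputs the chain length comes out near 13,600 residues, which is also in line with the residue positions above 13,000 cited for Seq Id 2 in Example 9.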

Cyclosporin synthetase is cleaved by adjusting the pH value. In particular, cleavage into large fragments of up to 200 kDa is achieved by adjusting the pH value to approximately 7.5 in a HEPES buffer with the addition of EDTA and DTT. The fragments obtained in this manner may be isolated and enriched as is conventional, for example by using chromatography and electrophoresis, such as the combination of anion exchange chromatography on MonoQ with HP-RPC or the combination of MonoQ with SDS-polyacrylamide gel electrophoresis/electroblot.

The sub-optimal conditions are principally obtained by altering the buffer conditions, and possibly also altering the cleavage temperature (see Example 3 as a possible variant). The nonetheless numerous fragments must each be isolated or enriched by 2 purification steps, it being in principle possible to use any chromatographic and electrophoretic separation techniques. In the case of cyclosporin synthetase fragments from Tolypocladium niveum, the combinations of anion exchange chromatography on MonoQ with HP-RPC (Examples 4 and 5) and MonoQ with SDS-polyacrylamide gel electrophoresis/electroblot (Examples 4 and 6) prove particularly advantageous.

The non-ribosomal biosynthetic pathway implies that the sequence of the cyclic peptide is determined by the corresponding arrangement of the amino acid activating domains. Each of these domains must perform analogous reactions, namely the activation of the amino acid by adenylation and binding in the form of a thioester. Hence it may be expected that recurrent, preserved moieties will be found in the protein sequence.

In fact, in previously analysed peptide synthetases, preserved regions within the sequences have been discovered, the number of which coincides with the number of amino acids to be activated: three for ACV synthetase (activates aminoadipic acid, cysteine and valine; Smith et al., 1990, MacCabe et al., 1991, Gutierrez et al., 1991); one each for gramicidine synthetase I (Kraetzschmar et al., 1989) and tyrocidine synthetase I (Weckermann et al., 1988); and four preserved regions in gramicidine synthetase 2, which activates the amino acids proline, valine, ornithine and leucine (Turgay et al., 1992).

Maximally accurate identification and characterisation of such preserved regions of cyclosporin synthetase at both the enzymatic and genetic levels constitutes the basis for well-directed genetic engineering in terms of altering enzyme specificity for the in vivo production of cyclosporin variants. It is therefore useful to identify proteolytic fragments of cyclosporin synthetase which may be correlated with a partial function of the synthetase. The following correlations were made:
(1) a protein fragment with a methyl transferase function (the method on which this work is based is, in principle, applicable to all methyl transferases and is published in Yu et al., 1983; a first application to cyclosporin synthetase is published in Lawen and Zocher 1990); see Example 7;
(2) a protein fragment capable of activating L-alanine (Example 8).

The method used in Example 8 exploits the fact that when proteins are subjected to limited proteolytic cleavage, inter alia intact domains are cleaved which, due to their correct spatial folding, are still capable of exercising their enzyme function to a limited extent. Theoretically, therefore, each amino acid activating domain may be identified with this method. The optimal conditions (for proteolytic cleavage and its timing in relation to amino acid activation) must, however, be determined by testing in each individual case. Moreover, unambiguous identification of a domain may be achieved only if the amino acid it activates occurs only once in the product.
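The uniqueness condition can be checked directly against the composition of cyclosporin A. The sketch below is illustrative: positions 1, 2 and 8 are as stated in this specification, while the remaining positions are taken from the generally known structure of cyclosporin A and are not listed in this section. Methylated residues are entered as their unmethylated substrates (for example glycine for sarcosine), on the assumption that methylation takes place on the enzyme after activation.

```python
from collections import Counter

# Substrate amino acids for positions 1..11 of cyclosporin A.
# Positions 1, 2 and 8 follow the text; the others are an assumption
# based on the generally known structure of cyclosporin A.
CYCLOSPORIN_A_SUBSTRATES = [
    "Bmt", "Abu", "Gly", "Leu", "Val", "Leu",
    "L-Ala", "D-Ala", "Leu", "Leu", "Val",
]


def uniquely_occurring(residues):
    """Return the residues that appear exactly once, in positional order."""
    counts = Counter(residues)
    return [r for r in residues if counts[r] == 1]


print(uniquely_occurring(CYCLOSPORIN_A_SUBSTRATES))
# ['Bmt', 'Abu', 'Gly', 'L-Ala', 'D-Ala']
```

Under these assumptions L-alanine is one of the residues that occurs only once, which fits its use as the labelled substrate in Example 8; leucine and valine would label several domains at once.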

The gene is isolated by DNA hybridisation with oligonucleotides specific to cyclosporin synthetase (Example 10). Whether a specific DNA fragment actually belongs to the cyclosporin synthetase gene is established by Northern hybridisation, since a non-transcribed neighbouring fragment does not hybridise with the corresponding RNA (Example 15). The DNA sequence of the cloned DNA of the cyclosporin synthetase gene is determined and compared with the amino acid partial fragments of cyclosporin synthetase (Examples 13 and 14).

Hence it is possible to transform Tolypocladium niveum with the complete gene for cyclosporin synthetase. Among the transformants, strains may be found which contain several copies of this gene or copies with altered regulation. Those strains are selected which, in fermentation tests, display increased cyclosporin formation or can form the same quantity of cyclosporin over a shorter fermentation period.

It is also possible to select the transformed strains by the activity of the cyclosporin synthetase, independently of whether cyclosporin is formed in greater quantities or faster. The isolated cyclosporin synthetase gene can act as an analytical aid in order to determine whether a specific strain of Tolypocladium niveum has a high concentration of the mRNA or not (Example 15). Such strains may then be subjected to conventional mutagenesis and strain selection. Even if the initial strain used for transformation is not limited in its cyclosporin synthetase activity, a strain is provided in this way which potentially allows greater cyclosporin formation. The combination of classical genetics (mutation and strain selection) with molecular genetics (transformation with isolated genes) allows the isolation of improved strains which could not be achieved by either of the two methods alone: not by classical genetics because a double mutation is extremely rare in a single selection stage; not by molecular genetics because in some circumstances an unknown factor has a limiting effect.

A further use of the isolated gene is gene-specific mutagenesis. Instead of producing mutations in the entire genome - and therefore also altering many uninvolved genes - the isolated gene alone is mutated using suitable methods (Sambrook et al., 1989) and then transformed into Tolypocladium niveum (Example 17). Among the transformants, the proportion of mutants in the cyclosporin synthetase gene is higher than with mutagenesis of the fungus. Mutants which form specific cyclosporins in greater or reduced quantities may more frequently be found than with conventional mutagenesis.

By internal sequence comparisons of the derived amino acid sequences (Example 14c) and the correlation of specific partial sequences (Example 6 and Example 9 or Example 14ab), domains of the cyclosporin synthetase for the activation of the individual amino acids may be localised (as performed above for non-ribosomal peptide synthetases). By this means, well-directed mutagenesis of the cyclosporin synthetase gene may be performed by interchanging the gene region of individual domains or by deliberately removing a corresponding region; the cyclosporin synthetase gene may also be extended by individual domains. After transformation of such mutated genes into Tolypocladium niveum, new cyclosporin variants may become accessible.

The cloned gene may be used to produce strains of Tolypocladium niveum which no longer have an active cyclosporin synthetase gene. Such strains may be used for the production of D-alanine or Bmt by fermentation or act as recipient strains for in vitro modified cyclosporin synthetase genes. To this end, an inactive version produced in vitro is constructed for the transformation (Example 18).

When screening for microorganisms which can synthesise cyclosporins, it is necessary that the active metabolites under test conditions are also actually formed in sufficient quantity. Such substances may moreover have slightly changed characteristics and may for this reason alone be overlooked. Example 16 describes the use of the isolated cyclosporin synthetase gene to find microorganisms which contain the cyclosporin synthetase gene in their genome. These genes do not have to be active for this purpose. On the basis of these hybridisations, the corresponding genes may be isolated in a manner analogous to Examples 10, 11 and 12 and transformed into Tolypocladium niveum. A strain may be used to this end which no longer contains any active cyclosporin synthetase. This interspecific recombination cannot be achieved with other methods. As described in the preceding paragraph, such strains may be subjected to a screening programme. In this case, genetic variability is based on the introduced gene which hybridises with the cyclosporin synthetase gene.

The control sequences of the cyclosporin synthetase gene may also be used for the construction of plasmids. An example of a control sequence is that which occurs in synp4 (Example 12). The promoter may be fused with a readily detectable reporter gene, such as for example the β-glucuronidase gene (Tada et al., 1991). Strains of Tolypocladium niveum which are transformed with these plasmids permit not only the selection of regulatory mutants, but moreover make it possible to measure and optimise promoter activity independently of other functions.

The following examples and figures illustrate the invention without, however, limiting it.

Figure 1: Restriction map of cyclosporin synthetase gene from Tolypocladium niveum cloned in λSYN3. The position of some restriction cleavage points is shown in relation to a scale (2.0, 4.0, 6.0, etc. kb). Among these, several partial fragments subcloned in plasmids are represented as rectangles (S5, E3, S3, etc.). If the corresponding rectangle is filled in, this means that the corresponding DNA fragment reacts with a high molecular weight RNA in Northern hybridisation (S5, E3, S3, E1, E2). Rectangles with lengthwise lines indicate that no bands were obtained in Northern hybridisation (E4, S2). Empty rectangles indicate that the DNA was not used as a probe (S4). The following two tables give the positions of the fragments (S5, H2, etc.) and enzyme restriction sites shown in figure 1 (in bp):

Start     End       Fragment Name
          2500      S5
1300      3300      H2
2000      5400      E3
2500      5300      S3
4700      11750     H3
5300      8400      S4
5400      7000      E1
7000      9200      E2
9200      12100     E4
10250     13850     S2

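The fragment positions lend themselves to a small lookup structure. The sketch below is illustrative only: the dictionary and helper names are hypothetical, the start position of S5 is not recoverable from the table and is left as None, and the single surviving S5 value is assumed to be its end position.

```python
# Fragment name -> (start_bp, end_bp), as listed for figure 1.
# Assumption: the one surviving S5 value (2500) is its end position;
# its start position is missing and is therefore None.
FRAGMENTS = {
    "S5": (None, 2500),
    "H2": (1300, 3300),
    "E3": (2000, 5400),
    "S3": (2500, 5300),
    "H3": (4700, 11750),
    "S4": (5300, 8400),
    "E1": (5400, 7000),
    "E2": (7000, 9200),
    "E4": (9200, 12100),
    "S2": (10250, 13850),
}


def length_bp(name):
    """Length of a fragment in bp, or None if a bound is unknown."""
    start, end = FRAGMENTS[name]
    if start is None or end is None:
        return None
    return end - start


def fragments_covering(position_bp):
    """Names of fragments with known bounds that contain the position."""
    return [
        name for name, (start, end) in FRAGMENTS.items()
        if start is not None and start <= position_bp <= end
    ]


print(length_bp("E1"))           # 1600
print(fragments_covering(6000))  # ['H3', 'S4', 'E1']
```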
Figure 2: Restriction map of plasmid pSIM10. The construction and structure of the plasmid is described in Example 18. The positions are stated in bp. Nucleotides 4749-6865 are DNA from Tolypocladium niveum containing the promoter of the cyclophilin gene. Nucleotides 1-1761 contain the hygromycin phosphotransferase gene from plasmid pCSN44 (Staben et al., 1989). Nucleotides 1761-4714 are from plasmid pGEM7Zf (Promega Inc.).

Figure 3: Restriction map of plasmid pSIM11. Construction of the plasmid is described in Example 18. Nucleotides 4924 to 8553 are the 3.6 kb XhoI restriction fragment from the cyclosporin synthetase gene. Nucleotides 8548-10489 and 1-4929 are plasmid pSIM10 (figure 2).

Figure 4: Restriction map of plasmid pSIM12. Construction of the plasmid is described in Example 18. Nucleotides 4924 to 5727 are the 0.8 kb XhoI restriction fragment from the cyclosporin synthetase gene. Nucleotides 5722-7663 and 1-4929 are plasmid pSIM10 (figure 2).
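The coordinates given for Figures 3 and 4 can be cross-checked against the stated fragment sizes. The sketch below assumes, for illustration, that the nucleotide ranges are inclusive and that the highest nucleotide number quoted for each plasmid equals its total size; neither assumption is stated in the text.

```python
def span_length(first: int, last: int) -> int:
    """Length of an inclusive nucleotide range."""
    return last - first + 1


# XhoI inserts from the cyclosporin synthetase gene.
insert_psim11 = span_length(4924, 8553)  # described as 3.6 kb
insert_psim12 = span_length(4924, 5727)  # described as 0.8 kb

# Assumed plasmid sizes: the highest coordinate quoted for each plasmid.
size_psim11 = 10489
size_psim12 = 7663

print(insert_psim11, insert_psim12)  # 3630 804
# Both plasmids share the same pSIM10 backbone, so the size difference
# should equal the difference between the two inserts.
print(size_psim11 - size_psim12 == insert_psim11 - insert_psim12)  # True
```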

Figure 5: Restriction map of cyclosporin synthetase gene from Tolypocladium niveum cloned in syncos13. The position of some restriction cleavage points is shown. The position of the part cloned in λSYN3 is marked with the cross-hatched bar.

All the restriction maps shown in figures 1, 2, 3, 4 and 5 are only approximate reproductions of restriction cleavage points in DNA molecules. The distances as drawn are proportional to the actual distances, but the actual distances may be different. Not all restriction cleavage points are shown; it is possible for further cleavage points to be present.



Example 1: Isolation of active cyclosporin synthetase in electrophoretically homogeneous form: 
The starting material used for the protein purification is Tolypocladium niveum, strain 7939/45 (Lawen et al., 1989). All steps are performed at a temperature between 0° and 4°C. 10 g of lyophilised mycelium is finely ground in a mortar with addition of liquid nitrogen and then suspended in buffer A (buffer A: 0.2 M HEPES pH 7.8, 0.3 M KCl, 4 mM EDTA, 40 (v/v)% glycerol, 10 mM DTT). The suspension is carefully stirred over ice for 1 hour and then centrifuged for 10 min at 10,000 g to remove cell debris.

The supernatant is collected and nucleic acids are precipitated with polyethyleneimine (final concentration 0.1%). The precipitate is removed by centrifugation for 10 min at 10,000 g.

The supernatant is again collected and proteins are precipitated using a solution of ammonium sulphate 
(saturated) in buffer B (0.1 M HEPES pH 7.8, 4 mM EDTA, 15 (v/v)% glycerol, 4 mM DTT) at room temperature. 
The solution is added dropwise to the supernatant up to a final concentration of 50% of saturation. The mixture 
is left to stand for a further 30 minutes to reach equilibrium. The precipitated proteins are collected by centri- 
fugation for 30 minutes at 30,000 g. The pellet obtained is resolubilised to 10 ml in buffer B. 
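The volume of saturated ammonium sulphate solution needed to reach a given percentage of saturation follows from a simple mixing balance. The sketch below is illustrative and assumes that volumes are additive and that the added solution is exactly 100% saturated; the function name is hypothetical.

```python
def saturated_solution_to_add(sample_ml: float, target: float,
                              start: float = 0.0) -> float:
    """Volume (ml) of saturated solution to add to reach `target` saturation.

    Saturations are fractions (0.5 means 50% of saturation).
    Mixing balance: (start * V + 1.0 * x) / (V + x) = target,
    hence x = V * (target - start) / (1 - target).
    Assumes additive volumes (an approximation).
    """
    if not 0.0 <= start < target < 1.0:
        raise ValueError("require 0 <= start < target < 1")
    return sample_ml * (target - start) / (1.0 - target)


# Reaching 50% of saturation from a salt-free supernatant needs an
# equal volume of saturated solution.
print(saturated_solution_to_add(100.0, 0.5))  # 100.0
```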

The resolubilised pellet is then subjected to molecular sieve chromatography. The molecular sieve is a 
HW65-F Fractogel obtained from Merck; the column dimensions are 2.6 cm x 93 cm, and the volume is 494 
ml. The column is operated under fast performance liquid chromatography (FPLC) conditions. The flow rate 
is 2 ml/min, continuous under buffer B. The cyclosporin synthetase elutes under these conditions at an elution 
volume of 260 to 310 ml. Processing 10 g of lyophilised mycelium produces 50 mg of cyclosporin synthetase 
in electrophoretically homogeneous form within 10 hours. 
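The column figures quoted above are mutually consistent, as a short calculation shows. The sketch below treats the column as an ideal cylinder and derives the run time to elution from the stated flow rate; the helper names are hypothetical.

```python
import math


def column_volume_ml(diameter_cm: float, length_cm: float) -> float:
    """Geometric volume of a cylindrical column (1 cm^3 = 1 ml)."""
    radius = diameter_cm / 2.0
    return math.pi * radius ** 2 * length_cm


def elution_time_min(elution_volume_ml: float, flow_ml_per_min: float) -> float:
    """Time after sample application at which a given volume has eluted."""
    return elution_volume_ml / flow_ml_per_min


print(round(column_volume_ml(2.6, 93)))  # 494
print(elution_time_min(260, 2.0))        # 130.0
print(elution_time_min(310, 2.0))        # 155.0
```

At 2 ml/min the synthetase therefore elutes roughly 130 to 155 minutes after the start of the run, at a little over half of the column volume.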



Example 2: Detection of enzymatic activity of cyclosporin synthetase:

80 µl of an enzyme sample in buffer B are incubated, in a total volume of 130 µl, with 3.5 mM ATP, 8 mM MgCl2, 10 mM DTT, 10 µM C9 acid, 690 µM of any other constituent amino acid and 100 µM S-adenosyl-methionine + 2 µCi of adenosyl-L-methionine-S-[methyl-3H] (75 Ci/mmol) for 1 hour at 22°C. Extraction and detection of the cyclosporin A formed are performed as described in Billich and Zocher 1987.
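The amount of label relative to unlabelled S-adenosyl-methionine in the assay can be estimated from the quantities given. The sketch below is an illustration under two assumptions that the text does not state explicitly: the 100 µM refers to unlabelled S-adenosyl-methionine at its final concentration in the 130 µl assay, and the tritiated material is added on top of it.

```python
def label_specific_activity(cold_uM: float, volume_ul: float,
                            hot_uCi: float, hot_Ci_per_mmol: float) -> float:
    """Specific activity of the mixed substrate pool in µCi per nmol."""
    cold_nmol = cold_uM * volume_ul / 1000.0  # µM x µl = pmol; /1000 gives nmol
    hot_nmol = hot_uCi / hot_Ci_per_mmol      # µCi / (Ci/mmol) = nmol
    return hot_uCi / (cold_nmol + hot_nmol)


activity = label_specific_activity(
    cold_uM=100.0, volume_ul=130.0, hot_uCi=2.0, hot_Ci_per_mmol=75.0
)
print(round(activity, 3))  # µCi per nmol of S-adenosyl-methionine
```

Since seven methyl groups are transferred per molecule of cyclosporin A, the incorporated radioactivity per nmol of product would be seven times this value, assuming no isotope effect.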
Example 3: Endoproteinase cleavages: 

The following endoproteinases (Boehringer Mannheim, sequencing grade) are used: trypsin from bovine pancreas (cleaves after arginine and lysine); LysC from Lysobacter enzymogenes (cleaves after lysine); GluC = V8 from Staphylococcus aureus (cleaves after glutamic acid and aspartic acid).

The cleavages are not performed under the conditions recommended by the manufacturer, but rather under 'sub-optimal' conditions. The cyclosporin synthetase is incubated in its storage buffer (0.1 M HEPES pH 7.5, 4 mM EDTA, 4 mM DTT, 15 (w/v)% glycerol) with protease in a ratio of 100 µg : 1 µg for 2 to 3 hours at 25°C. In this way, fragments of a size up to approximately 200 kDa are produced.
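The cleavage specificities listed for the three endoproteinases can be expressed as a simple rule table. The sketch below is a minimal illustration of complete digestion only: it cuts after every matching residue, ignores known exceptions to these rules, and does not model the sub-optimal conditions that yield the large fragments described here. The names and the toy sequence are hypothetical.

```python
# Residues (one-letter code) after which each protease cleaves,
# as stated in the text.
CLEAVAGE_RULES = {
    "trypsin": "KR",  # after lysine and arginine
    "LysC": "K",      # after lysine
    "GluC": "ED",     # after glutamic acid and aspartic acid
}


def digest(sequence: str, protease: str) -> list:
    """Split a sequence after every residue the protease recognises."""
    residues = CLEAVAGE_RULES[protease]
    fragments = []
    start = 0
    for index, amino_acid in enumerate(sequence):
        if amino_acid in residues:
            fragments.append(sequence[start:index + 1])
            start = index + 1
    if start < len(sequence):
        fragments.append(sequence[start:])
    return fragments


toy = "AAKGGRTTEPP"  # hypothetical peptide, not from Seq Id 2
print(digest(toy, "trypsin"))  # ['AAK', 'GGR', 'TTEPP']
print(digest(toy, "LysC"))     # ['AAK', 'GGRTTEPP']
print(digest(toy, "GluC"))     # ['AAKGGRTTE', 'PP']
```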

Example 4: MonoQ purification of fragments: 

Purification is performed using a commercially available MonoQ column (HR 5/5) obtained from Pharmacia, at 4°C. The protease digested protein sample is diluted (1:5) in buffer 1 (20 mM HEPES pH 7.5, 2 mM EDTA, 2 mM DTT, 5 w/v% glycerol) and applied to the column. The gradient elution of fragments is carried out in 20 ml of 0% to 100% buffer 2 (buffer 1 + 500 mM NaCl).

Example 5: HP-RPC purification of MonoQ fractions:

Purification is performed using a commercially available Nucleosil 300A-C4-5µ column of dimensions 85 x 4.5 mm. The MonoQ fraction sample is diluted (1:5) in buffer 1 (5% acetonitrile, 0.1% TFA) and applied at a flow rate of 1 ml/min and room temperature. Gradient elution is carried out in 85 minutes from 0% to 100% buffer 2 (90% acetonitrile, 0.1% TFA).
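The eluent composition at any point of the gradients in Examples 4 and 5 can be read off by linear interpolation. The sketch below assumes that both gradients are linear (the text gives only their end points) and that buffer 1 of Example 4 contains no NaCl; the function name is hypothetical.

```python
def linear_gradient(progress: float, total: float,
                    start_value: float, end_value: float) -> float:
    """Value of a linearly changing quantity at `progress` out of `total`.

    `progress` and `total` share a unit (ml or minutes); the result has
    the unit of `start_value` and `end_value`.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    clamped = min(max(progress, 0.0), total)
    return start_value + (end_value - start_value) * clamped / total


# Example 4: 0 to 500 mM NaCl over 20 ml; concentration after 10 ml.
print(linear_gradient(10, 20, 0.0, 500.0))  # 250.0

# Example 5: 5% to 90% acetonitrile over 85 minutes, i.e. 1% per minute.
print(linear_gradient(40, 85, 5.0, 90.0))   # 45.0
```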

Example 6: SDS-PAGE/Blot purification of MonoQ fractions: 

SDS-PAGE is performed according to Laemmli (1970). Thioglycolic acid (2 mM) is added to the electrophoresis buffer in order to prevent the N termini being blocked by residual radicals from the polymerisation reaction. The MonoQ fractions are used after denaturation with SDS for the electrophoresis. For sequencing, the proteins are blotted out of the gel onto glass fibre membranes ("Glassybond" from Biometra) using the semi-dry method.

Example 7: Protein fragment with methyl transferase activity: identification and purification

The active centre of methyl transferases may be crosslinked with its substrate S-adenosyl-methionine by UV irradiation. This may be exploited by providing a radioactive substrate and so achieving radioactive labelling of the enzyme (Yu et al., 1983). This method, which is also known as "photoaffinity labelling", has been used on cyclosporin synthetase (Lawen and Zocher 1990) and it is possible to show that several labelled protein fragments are produced upon subsequent protease digestion. A labelled fragment is enriched by a combination of the methods described in Examples 4 and 6 and so made accessible to sequencing (see Example 9: aa4). This fragment has a size of approximately 47,000 Dalton.

Example 8: Amino acid activating protein fragments: identification and purification

Protein fragments that have the capacity to activate an amino acid are identified by loading the synthetase with radioactively labelled amino acid in the simultaneous presence of an endoproteinase. Approximately 500 µg of purified cyclosporin synthetase are incubated with 25 mM of ATP, 30 mM MgCl2 and 5 µCi of 14C-L-alanine and are simultaneously treated with, for example, endoproteinase LysC. The reaction is arrested after 3 hours by precipitation of the proteins with TCA. The fragments are resolubilised in a sample buffer for SDS-PAGE, omitting reducing agents. Half of the batch is subjected to SDS-PAGE and the labelled protein fragment is detected by autoradiography of the gel after amplification in "amplify solution" (from NEN) and drying. A fragment with an Mr of approximately 140,000 Dalton is identified and enriched by a combination of the methods described in Examples 4 and 6. The amino acid sequence is given in Example 9: aa13.

Example 9: Amino acid partial sequences of cyclosporin synthetase

The following partial sequences are obtained from cyclosporin synthetase obtained from Example 6.

aa1: amino acids 1916 to 1942 of Seq Id 2 with amino acid 1921 being S and 1942 being I
aa2: amino acids 2906 to 2925 of Seq Id 2
aa3: amino acids 12240 to 12261 of Seq Id 2 with amino acid 12254 being E.
aa4: amino acids 6535 to 6550 of Seq Id 2
aa5: amino acids 12654 to 12671 of Seq Id 2
aa6: amino acids 1099 to 1117 of Seq Id 2 with amino acids 1116 and 1117 being V and L
aa8: amino acids 1984 to 1996 of Seq Id 2 with amino acid 1991 undeterminable.
aa9: amino acids 13718 to 13738 of Seq Id 2 with amino acid 13731 undeterminable.
aa10: amino acids 9611 to 9622 of Seq Id 2
aa12: amino acids 11475 to 11484 of Seq Id 2
aa13: amino acids 13601 to 13620 of Seq Id 2
aa14: amino acids 9549 to 9568 of Seq Id 2 with amino acid 9565 undeterminable.
aa15: amino acids 9504 to 9521 of Seq Id 2
aa16: amino acids 13569 to 13586 of Seq Id 2 with amino acid 13568 being G
aa17: amino acids 1020 to 1034 of Seq Id 2
aa19: amino acids 9070 to 9084 of Seq Id 2 with amino acids 9082 and 9083 undeterminable
aa20: amino acids 6532 to 6546 of Seq Id 2 with amino acid 6545 undeterminable



Example 10: Isolation of λ-clones which hybridise with an oligonucleotide specific to cyclosporin synthetase

a) Construction of a genomic λ-gene library from Tolypocladium niveum.

DNA is isolated from the mycelium of a culture of Tolypocladium niveum grown in medium 1 [50 g/l of maltose, 10 g/l of casein peptone (digested with trypsin, Fluka), 5 g/l of KH2PO4 and 2.5 g/l of KCl; the pH value is adjusted to 5.6 with phosphoric acid]. 4 ml of a spore suspension of Tolypocladium niveum strain ATCC 34921 with 4 x 10 s spores per ml are added to 200 ml of medium 1 in a 1 l conical flask and are shaken for 72 hours at 25°C and 250 rpm. The mycelium is filtered off with a Buchner funnel, washed with 10 mM of Tris-Cl pH 8.0, 1 mM EDTA and ground to a fine powder under liquid nitrogen. Nuclei are isolated from 40 g of moist mycelial mass and are then lysed; the DNA is purified by CsCl-EtBr centrifugation. This method is described in Jofuku and Goldberg (1988). 4.3 mg of DNA are obtained, which, in a 0.5% agarose gel, produces a band exhibiting lower mobility than λ-DNA.

40 ng of the DNA are incubated with 1.4 units of the restriction enzyme Sau3A in 10 mM tris-Cl pH 7.5,
10 mM MgCl2, 1 mM DTE, 50 mM NaCl for 60 minutes at 37°C and then 10 minutes at 65°C. The extent
of cleavage is verified on an agarose gel: part of the DNA is between 10 and 20 kb in size. The DNA is then
applied to two NaCl gradients, which are produced by freezing and slowly thawing at 4°C two Beckman SW28.1
ultracentrifuge microtubes with 20% NaCl in TE (10 mM tris-Cl, pH 8.0, 1 mM EDTA). The microtubes are
centrifuged for 16 hours at 14,000 rpm in a Beckman L8M ultracentrifuge in rotor SW28.1. The contents of the
microtubes are fractionated. Fractions with DNA larger than 10 kb are combined and dialysed against TE. After
concentration of the DNA to 500 ng/ml, the DNA is combined with λEMBL3-DNA (Promega Inc.), previously
cleaved with EcoRI and BamHI. 1.5 µg of the DNA and 1 µg of λEMBL3-DNA (cleaved with EcoRI and BamHI)
are ligated for 16 hours at 16°C in 5 µl of 30 mM tris-Cl pH 7.5, 10 mM MgCl2, 10 mM DTE and 2.5 mM
ATP after the addition of 0.5 U of T4-DNA ligase (DNA concentration 500 µg/ml). The ligation mixture is
packaged in vitro with the assistance of protein extracts ("packaging mixes", Amersham). The λ-lysates produced
are titrated with E. coli KW251 (Promega Inc.). Approximately 4.5 x 10^5 pfu are obtained.
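The DNA concentration quoted for the ligation follows from the two DNA masses and the reaction volume. A minimal sketch of that arithmetic in Python is given here; the function name is illustrative and not part of the described method.

```python
def dna_concentration_ug_per_ml(masses_ug, volume_ul):
    """Total DNA concentration in ug/ml for DNA masses in ug and a volume in ul."""
    return sum(masses_ug) / volume_ul * 1000.0


# 1.5 ug of genomic DNA plus 1 ug of vector DNA in a 5 ul ligation (values from the text)
ligation_concentration = dna_concentration_ug_per_ml([1.5, 1.0], 5.0)
print(ligation_concentration)
```

With the stated quantities the result agrees with the DNA concentration given in parentheses for the ligation.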

b) Isolation of λ-clones

40,000 recombinant phages from the Tolypocladium niveum gene library are cast with E. coli strain KW251
onto 90 mm TB plates (TB contains 10 g/l of bacto tryptone, 5 g/l of NaCl and 0.7% of agarose; the pH is
adjusted to 7.5 with NaOH). Two blots onto nitrocellulose (Stratagene) are made from each plate (Maniatis et
al., 1982). From the amino acid sequence of the cyclosporin synthetase fragment aa9 (Example 9), an
oligonucleotide mixture (96 different oligonucleotides, each 20 nucleotides in length) with the sequences






EP0 578 616 A2 



5' GCA TCA ATA TTA AAT TGA TC 3'
     G   G   G   G   C   G
         T

may be produced on the basis of the genetic code. 1.5 µg of this oligonucleotide mixture are incubated in 25 µl
of 50 mM tris-Cl pH 9.5, 10 mM MgCl2, 5 mM DTE, 5% glycerol with 150 µCi γ-ATP and 20 U of
polynucleotide kinase (Boehringer) for 30 minutes at 37°C. Over 80% of the radioactivity is incorporated.
Hybridisation is performed at 37°C in 400 ml 6 x SSPE (Maniatis et al., 1982), 5 x Denhardt's solution
(Maniatis et al., 1982), 0.1% SDS, 100 µg/ml denatured herring sperm DNA (Maniatis et al., 1982), 0.1 mM ATP,
1.4 x 10^6 cpm/ml 32P-labelled oligonucleotide mixture for 16 hours. The filters are washed three times for
5 minutes and twice for 30 minutes in 6 x SSC (Maniatis et al., 1982) at 4°C. The filters are then washed for
10 minutes at 37°C in a TMAC (tetramethylammonium chloride) washing solution which is prepared according to
Wood et al., 1985. Finally, the filters are washed for 30 minutes at 57°C in the TMAC washing solution, dried
and exposed for 10 days with a Kodak Xomatik AR X-ray film. Regions of the agarose layer corresponding to
positive signals on the X-ray film are punched out and resuspended in SM buffer (5.8 g/l NaCl, 2 g/l
MgSO4 x 7 H2O and 50 mM tris-Cl pH 7.5). A suitable dilution is again cast with KW251 onto a TB plate. The
plaques are again transferred onto nitrocellulose. The DNA is isolated from plaques producing a positive
hybridisation signal in the second hybridisation. The purified DNA from these phages is used for Southern
hybridisations and restriction analyses. Figure 1 shows the restriction map of the Tolypocladium niveum
portion of such a λ-clone (= λSYN3). Subcloning is performed in various plasmid vectors (for example pUC18,
Pharmacia).
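The size of the oligonucleotide mixture can be checked by expanding the degenerate probe into its individual sequences. The sketch below assumes that the alternative bases printed under the probe belong to the third position of each triplet (A or G in four triplets, T or C in the fifth, and A, G or T in the second), which in IUPAC notation reads GCR TCD ATR TTR AAY TGR TC. This placement is an assumption made for illustration; it is consistent with the stated count of 96 oligonucleotides of 20 nucleotides each.

```python
from itertools import product

# Standard IUPAC ambiguity codes, restricted to the symbols used here.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",   # purine
    "Y": "CT",   # pyrimidine
    "D": "AGT",  # not C
}

# Assumed reading of the probe (see the note above); not given in this form in the text.
DEGENERATE_PROBE = "GCRTCDATRTTRAAYTGRTC"


def expand(degenerate):
    """Return every concrete sequence encoded by a degenerate oligonucleotide."""
    choices = [IUPAC[symbol] for symbol in degenerate]
    return ["".join(bases) for bases in product(*choices)]


variants = expand(DEGENERATE_PROBE)
print(len(variants))             # number of oligonucleotides in the mixture
print({len(v) for v in variants})  # lengths that occur in the mixture
```

Under this reading the expansion contains the sequence printed in the top row of the probe as one of its members.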

To isolate λ-clones containing the neighbouring DNA fragments ("chromosome walking"), the plaque
hybridisation method described above is repeated a number of times, the marginal restriction fragments being
used in each case as 32P-labelled probes. In order to clone the DNA adjoining the region shown schematically
in figure 1 (λSYN3), fragment S5 is used (figure 1). Hybridisation is then performed at 42°C in 6 x SSPE, 50%
formamide, 5 x Denhardt's solution, 0.1% SDS, 100 µg/ml denatured herring sperm DNA, and 100 µM ATP.
Before hybridisation, the 32P-labelled DNA is heated to 100°C for 5 minutes and cooled in ice. After 16 to 20
hours, the filters are washed: three times for 10 minutes in 2 x SSC, 0.1% SDS and twice for 30 minutes in
0.2 x SSC, 0.1% SDS at 65°C. The dried filters are autoradiographed. Those areas of the agarose
corresponding to positive signals are further processed as described above.

Example 11: Isolation of cosmid clones containing parts of the cyclosporin synthetase gene 

a) Construction of a genomic cosmid gene library from Tolypocladium niveum

Protoplasts are produced as described in Example 17. Approximately 10^9 protoplasts are carefully lysed
in 2 ml of TE (10 mM tris-HCl, 1 mM EDTA, pH 8.0). 0.1 mg/ml of RNase A are added and incubation is continued
for 20 minutes at 37°C. After the addition of 0.5% SDS and 0.1 mg/ml of proteinase K, incubation is continued
for a further 40 minutes at 55°C. The batch is very carefully extracted twice with each of TE-saturated phenol,
phenol/chloroform (1:1) and chloroform/isoamyl alcohol (24:1) (Maniatis et al., 1982). The aqueous, slightly
viscous supernatant is combined with one tenth its volume of 3 M sodium acetate (pH 5.2) and covered with
a layer of 2.5 times its volume of absolute ethanol at -20°C, and the DNA, found as fine threads at the phase
interface, is wound up using glass rods. The DNA is dissolved in 3 ml of TE for at least 20 hours. Depending on
the quality of the protoplasts, approximately 500 µg/ml of DNA are obtained. Analysis with field inversion gel
electrophoresis (FIGE) (0.8% agarose, 0.5 x TBE (Maniatis et al., 1982), 6 V/cm, forwards pulse 0.2 to 3 sec,
pulse ratio 3.0, running time 5 hours) gives a size greater than 150 kb. Two batches of 135 µg of DNA are
cleaved with 7.5 and 15 units respectively of restriction enzyme NdeII (from Boehringer Mannheim) for 1 hour
at 37°C in 1 ml of buffer (tris-acetate 33 mM, magnesium acetate 10 mM, potassium acetate 66 mM, DTT 0.5
mM, pH 7.9). Aliquots of the cleaved DNA are tested with FIGE and give a maximum size for the fragments
obtained of approximately 45 and 30 kb respectively.

Using a gradient mixer, linear NaCl density gradients from 30% to 5% in 3 mM EDTA pH 8.0 are produced
in ultracentrifuge microtubes and the DNA fragments applied. After centrifugation for 5 hours at 37,000 rpm
and 25°C (Beckman L7-65 ultracentrifuge, rotor SW41), the gradient is harvested in 500 µl fractions. Fractions
with DNA greater than 30 kb and less than 50 kb are dialysed three times for two hours against TE (tris-HCl
10 mM, EDTA 1 mM, pH 8.0), precipitated with ethanol and each dissolved in 50 µl TE.

sCos1 (from Stratagene) is used as the cloning vector. The vector arms cleaved with BamHI and XbaI are
produced and modified as stated by Evans et al., (1989). 1 µg of the cleaved vector is ligated with
approximately 500 ng of the DNA fragments in 20 µl of ligation mix (tris-HCl 66 mM, MgCl2 [...] mM, DTE 1 mM,
ATP 1 mM, pH 7.5) with 16 units of T4-DNA ligase (from Boehringer) for 16 hours at 12°C. 4 µl portions of the
batch are packaged into lambda phage heads with packaging extracts (Gigapak, from Stratagene). E. coli SRB
(from Stratagene) is used as the host strain for the infection and the bacteriophage lambda-competent cells
are produced following the method of Sambrook et al., (1989). After infection, the batches are plated in
aliquots onto LB medium (Maniatis et al., 1982) with 75 µg/ml of ampicillin. Recombinant clones are discernible
as colonies after 20 hours at 37°C. In total, approximately 50,000 colonies are obtained, which are then
suspended in 0.9% NaCl/20% glycerol and stored at -70°C. Analysis of 40 randomly selected clones by isolation
and restriction of the cosmids obtained shows that all the clones contain recombinant cosmids; the average
insert size is 36 kb.
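The number of colonies and the average insert size allow a rough estimate of how well such a library represents a genome. The sketch below uses the Clarke-Carbon relation N = ln(1 - P) / ln(1 - f), where f is the insert size divided by the genome size. The genome size used here is a purely hypothetical placeholder, since the text does not state one; only the colony count and insert size are taken from the description.

```python
import math


def clones_needed(probability, insert_kb, genome_kb):
    """Clarke-Carbon estimate of the clones needed to contain a given sequence
    with the requested probability."""
    fraction = insert_kb / genome_kb
    return math.ceil(math.log(1.0 - probability) / math.log(1.0 - fraction))


def probability_present(n_clones, insert_kb, genome_kb):
    """Probability that a given sequence is present in a library of n_clones."""
    return 1.0 - (1.0 - insert_kb / genome_kb) ** n_clones


LIBRARY_CLONES = 50_000     # colonies obtained (from the text)
AVERAGE_INSERT_KB = 36      # average insert size (from the text)
ASSUMED_GENOME_KB = 30_000  # hypothetical genome size, NOT from the text

total_cloned_kb = LIBRARY_CLONES * AVERAGE_INSERT_KB
print(total_cloned_kb)
print(clones_needed(0.99, AVERAGE_INSERT_KB, ASSUMED_GENOME_KB))
print(probability_present(LIBRARY_CLONES, AVERAGE_INSERT_KB, ASSUMED_GENOME_KB))
```

Substituting a measured genome size for the placeholder gives the corresponding estimate; the formula also assumes random, unbiased cloning of fragments.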

b) Isolation of cosmid clones 

The cosmid gene library is plated at a density of approximately 2500 colonies per 85 mm plate on LB
medium with 75 µg/ml of ampicillin (Maniatis et al., 1982). Transfer of each onto two nylon membranes (Duralon
UV, Stratagene) is performed as described in Sambrook et al., (1989). The 1.6 kb HindIII fragment from λsyn3
(see figure 1) is labelled with alpha-32P-dATP using "Random Priming" (from Stratagene) and is used as a
hybridisation probe. Prehybridisation is performed for 6 hours, hybridisation for 18 hours at 42°C in 5 x SSC,
40% formamide, 5 x Denhardt's (Maniatis et al., 1982), 0.1% SDS, 25 mM NaH2PO4, pH 6.5, and 250 µg/ml
of herring sperm DNA. The filters are washed twice for 10 minutes in 2 x SSC/0.1% SDS at room temperature
and twice for 40 minutes in 1 x SSC/0.1% SDS at 60°C. The membranes are exposed for 14 hours on X-ray
film (Kodak Xomatic AR). Colonies having positive signals are purified, and the corresponding cosmid DNA is
isolated from the colonies and characterised by various restriction analyses and hybridisations with the
labelled λsyn3 probes and the vector DNA sCos1. Figure 5 shows the restriction map of the cloned regions of
such a cosmid, syncos13; the Tolypocladium niveum DNA contained in it amounts to approximately 35 kb and also
includes the region of λsyn3.

Example 12: Isolation of a P1 clone with the complete gene for cyclosporin synthetase 

Protoplasts are produced from Tolypocladium niveum as described in Example 17 and suspended at a
density of 10^[...]/ml in TPS. 1 ml portions of this suspension are mixed with 1 ml of 1.6% melted agarose
(InCert from FMC) held at 40°C and cast into small 1.5 mm thick blocks using a casting stand (BioRad). After
solidifying, the blocks are transferred into lysis buffer (0.45 M EDTA pH 8.0, 1% N-lauroyl sarcosine, 1 mg/ml
proteinase K) and incubated for 16 hours at 55°C. The blocks are washed three times for 2 hours in 0.5 M EDTA
pH 8.0 while being slowly rocked and are then stored at 4°C. Before being cleaved, the blocks are cut into
small strips, transferred into Eppendorf microtubes and washed four times for 2 hours and once for 16 hours in
TE. The blocks are preincubated in four parallel batches at 4°C, each in 300 µl BamHI buffer (from NEB),
supplemented with 100 µg/ml of bovine serum albumin (from NEB) and 80 µM S-adenosylmethionine, for 3 hours on
ice. Then, 2 units of BamHI (from NEB) and 16, 20, 24 or 28 units of BamHI methylase (from NEB) are added to
each batch and incubation is continued for a further 90 minutes on ice and then for 1 hour at 37°C. The
reactions are arrested by the addition of 20 mM of EDTA and 0.5 mg/ml of proteinase K and incubated at 37°C
for 30 minutes.

The blocks are applied to a 1% agarose gel (Seaplaque GTG from FMC) and the DNA fragments separated
by pulsed field gel electrophoresis (Chef DR II from BioRad, 0.5 x TBE (Maniatis et al., 1982), switch
interval of 8-16 sec, 150 V, 16 h, 12°C).

The region of DNA fragments between 70 and 100 kb is cut out of the gel and the agarose hydrolysed
with β-agarase (from NEB). The DNA solution obtained in this manner is very carefully extracted once with
tris-saturated phenol and once with chloroform/isoamyl alcohol (24+1) and then concentrated to a final volume
of approximately 100 µl by extraction with 1-butanol.

pNS528tet14-Ad10-SacIIB (from DuPont-NEN) is used as the cloning vector. The vector arms are
prepared as stated in Pierce et al., (1992). Approximately 250 ng of the cleaved vector are ligated with
approximately 500 ng of the DNA fraction for 16 hours at 16°C (performed as in Example 11, total volume 15 µl).
After heating the ligation to 70°C for 10 minutes, 4 µl aliquots are cleaved with pacase (from DuPont-NEN) and
packaged into bacteriophage P1 envelopes by addition of the "head/tail" extract, as described in Pierce and
Sternberg (1991). After infection of E. coli NS3529, the preparation is plated onto LB medium (Maniatis et al.,
1982) with 25 µg/ml kanamycin and 5% saccharose. Recombinant clones become visible after incubation of the
plates at 37°C for 20 h.

In total, approximately 2000 colonies are obtained, which are stored as a pool in 0.9% NaCl/20% glycerol
at -70°C as "P1 library".

The gene library (10 x 500 colonies) is screened as described in Example 11 (cosmid clones). Inter alia,
a positive clone is obtained which contains all the fragments of the cosmid clone syncos13, together with
a further approximately 30 kb of the cyclosporin synthetase gene in the 5' direction. Hybridisation
with oligonucleotide mixtures derived from suitable amino acid sequences (see Example 9 and Example 10)
shows that all the tested sequences are present on this P1 clone (synp4). In this way, it is ensured that the
complete gene for cyclosporin synthetase is contained on this clone synp4.

Example 13: DNA partial sequence of the cyclosporin synthetase gene from Tolypocladium niveum ATCC34921

a) The DNA cloned as described in Examples 11 and 12 is sequenced and is illustrated as Seq Id 1.

b) A polypeptide with the amino acid sequence illustrated as Seq Id 2 may be derived from this DNA.

Example 14: Comparison of the amino acid sequences derived from the DNA with the cyclosporin synthetase
amino acid partial sequences

The DNA of Seq Id 1 is translated on the basis of the genetic code into an amino acid sequence (i.e. position
1 of the protein sequence corresponds to position 885 of the DNA sequence) and is compared with the amino
acid sequences given in Example 9:

AA-Partial sequence 3: in Seq Id 2, position 12254 is T. Otherwise all amino acids correspond.
AA-Partial sequence 4: all amino acids correspond.
AA-Partial sequence 5: all amino acids correspond.
AA-Partial sequence 9: in Seq Id 2, position 13730 is W. Otherwise all amino acids correspond. (Position 13
of the AA partial sequence aa9 could not be determined.)
AA-Partial sequence 10: all amino acids correspond.
AA-Partial sequence 12: all amino acids correspond.
AA-Partial sequence 13: all amino acids correspond.
AA-Partial sequence 14: in Seq Id 2, position 9565 is C. Otherwise all amino acids correspond.
AA-Partial sequence 15: all amino acids correspond.
AA-Partial sequence 16: Position 1 of the AA partial sequence aa16 does not correspond to the AA sequence
of Seq Id 2. Otherwise all amino acids correspond.
AA-Partial sequence 19: in Seq Id 2, positions 9082 and 9083 are R and Y. Otherwise all amino acids
correspond.
AA-Partial sequence 20: in Seq Id 2, position 6545 is W. Otherwise all amino acids correspond.

Further, internal comparison of the amino acids 13804-14063 of Seq Id 2 with amino acids 12304-12563
of Seq Id 2 shows that 178 out of 259 amino acids are identical (68.7%). A further 28 amino acid residues
(10.8%) are functionally similar. In total, 11 partial regions similar to each other may be identified in this
manner.
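The percentages quoted for the internal comparison are simple ratios over the compared positions. A minimal sketch of the calculation is given below; the function names and the short toy sequences are illustrative assumptions and are not taken from Seq Id 2.

```python
def percent_of_alignment(count, compared_positions):
    """Share of compared positions, in percent, rounded to one decimal place."""
    return round(100.0 * count / compared_positions, 1)


def count_identical(seq_a, seq_b):
    """Count identical residues between two pre-aligned sequences of equal length."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a == b)


# Figures quoted in the text: 178 identical and 28 similar residues out of 259 compared.
print(percent_of_alignment(178, 259))
print(percent_of_alignment(28, 259))

# Toy example with made-up peptides (not from the patent sequences).
print(count_identical("MKTLAV", "MKSLAV"))
```

The same two helpers can be applied to any pair of aligned partial regions once the alignment itself has been produced by other means.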

Example 15: Isolation of RNA from mycelium of Tolypocladium niveum and Northern hybridisation

A 1 l conical flask with 100 ml of medium 4 (Dreyfuss et al., 1976) is inoculated with a spore suspension
of Tolypocladium niveum ATCC34921 (1 x 10^7 spores/ml) and shaken for 96 hours at 250 rpm and 25°C. 1 l
conical flasks with 100 ml of medium 5 (Dreyfuss et al., 1976) are inoculated with 10 ml of this preculture and
shaken for 7 days at 25°C and 250 rpm. The cyclosporin A concentration is determined (Dreyfuss et al., 1976)
to be 100 µg/ml. 8 g of moist mycelial mass is filtered, washed with TE (10 mM tris-Cl pH 7.5, 1 mM EDTA)
and ground to a fine powder in a mortar under liquid nitrogen. RNA is then isolated according to the method
described by Cathala et al., (1983). 4 mg of RNA are obtained, which are stored at -70°C. 10 µg of the RNA
are separated on a denaturing 1.2% agarose gel containing 0.6 M formaldehyde. The electrophoresis buffer
is 0.2 M MOPS, 50 mM sodium acetate, 10 mM EDTA, pH 7.0. The RNA is dissolved in a buffer mixed together
from 0.72 ml formamide, 0.16 ml of 10 x concentrated electrophoresis buffer, 0.26 ml formaldehyde, 0.18 ml
water and 0.10 ml glycerol. The samples are heated to 100°C for 2 minutes and separated at 115 V, 100 mA
over 2 hours. The gel is shaken three times for 20 minutes in 10 x SSC, blotted onto Hybond N filter and fixed
by UV treatment. Hybridisation is performed at 42°C in 6 x SSPE, 50% formamide, 5 x Denhardt's solution,
0.1% SDS, 100 µg/ml denatured herring sperm DNA, and 100 µM ATP. The 32P-labelled DNA (fragments of
the cloned DNAs described in Examples 9 to 12) is heated to 100°C for 5 minutes and cooled in ice before
hybridisation. After 16 to 20 hours, the filters are washed: three times for 10 minutes in 2 x SSC, 0.1% SDS
and twice for 30 minutes in 0.2 x SSC, 0.1% SDS at 65°C. The dried filters are autoradiographed. If the
fragment used as the probe is a fragment of the cyclosporin synthetase gene, a band may be detected on the
X-ray film after 24 to 72 hours of autoradiography at -70°C. The band exhibits distinctly less mobility than
the largest of the comparison RNA used (9500 b; RNA-ladder, BRL). Figure 1 summarises the results of such
hybridisations: in relation to the restriction map of a λ-clone, the isolation of which is described in
Example 10, the positions of individual restriction fragments are given which were used as probes in Northern
hybridisations. The filled-in rectangles indicate that the bands described above may be detected (E2, E3, E1,
S3, S5), while the rectangles with the transverse lines stand for those fragments which do not hybridise with
such a band (E4, S2). (Fragment S4 was not used as a probe).

Example 16: Identification of homologous synthetase genes

100 ml of medium 1 (Dreyfuss et al., 1976) are inoculated with 1 x 10^8 fungal spores and shaken for 72
hours at 25°C and 250 rpm. The mycelium is filtered out, washed with TE and lyophilised. 100 mg of lyophilised
mycelium are added to 700 µl of lysis buffer (200 mM tris-Cl pH 8.5, 250 mM NaCl, 25 mM EDTA, 0.5% SDS)
and 100 mg of aluminium oxide powder (Sigma A2039) in an Eppendorf homogeniser and are homogenised.
500 µl of phenol-chloroform are then added and vigorously mixed in. After 15 minutes centrifugation, the
extraction is repeated. A volume of 3 M sodium acetate pH 5.2 corresponding to 0.1 times the volume of the
supernatant is added to the supernatant, and then a volume of i-propanol corresponding to 0.6 times the volume
of the supernatant is thoroughly mixed in. After 5 minutes of centrifugation, the pellet is washed with 70%
ethanol, briefly dried and dissolved in 100 µl of TE with 100 µg/ml of RNase and incubated for 15 minutes at
37°C. The phenol-chloroform extraction and ethanol precipitation are then repeated. The precipitated DNA is
collected.

5 µl portions of the DNA are cleaved with XhoI, separated on an agarose gel and blotted onto a nylon filter.
The filters are hybridised with 32P-labelled λSYN3 DNA as a probe. Hybridisation is performed under standard
conditions, as described in Example 10 ("chromosome walking"). The hybridisations may, however, also be
performed under less stringent conditions.

The following hybridising bands are obtained with DNA from Tolypocladium niveum (all data are estimates
based on mobility in the gel): 3.6 kb, 3.4 kb, 3.2 kb, 3.0 kb, 2.3 kb, 1.9 kb and 0.7 kb. DNA from Fusarium
solani ATCC 46829 also displays bands at 3.6 kb, 3.4 kb, 1.9 kb and 0.7 kb together with a further band at
approximately 2.1 kb. DNA from Neocosmospora vasinfecta ATCC 24402 also displays the bands at 3.6 kb, 3.4 kb,
1.9 kb and 0.7 kb, together with two further bands at 2.9 kb and 1.8 kb. DNA from Tolypocladium geodes,
Acremonium sp. S42160/F, Paecilomyces sp. SB4-21622/F, Verticillium sp. 85-22022/F (Dreyfuss, 1986) each
display several hybridising bands in the range 0.7 kb to 7 kb.

On the basis of the DNA sequence Seq Id 1, the following oligonucleotide pairs may be synthesised:
Nucleotides 35073-35092 of Seq Id 1
Nucleotides 37848-37829 of Seq Id 1 (complementary strand)
or also
Nucleotides 40309-40328 of Seq Id 1
Nucleotides 42018-41999 of Seq Id 1 (complementary strand)

If 50 ng of the Tolypocladium geodes CBS723.70 DNA is amplified with the first of the two oligonucleotide
pairs described above (Sambrook et al., 1989; 30 cycles: 1 minute 30 sec 94°C; 2 minutes 30 sec 50°C;
6 minutes 72°C), a 350 bp DNA is produced. If a part of this DNA is sequenced, the sequence given as Seq Id 3
is obtained. This DNA sequence is 75.1% homologous to the corresponding DNA sequence of Seq Id 1.

Also, if 50 ng of the Neocosmospora vasinfecta ATCC 24402 DNA is amplified with the second of the two
oligonucleotide pairs described above (Sambrook et al., 1989; 30 cycles: 1 minute 30 sec 94°C; 2 minutes
30 sec 50°C; 6 minutes 72°C), a 1713 bp DNA is produced. If this DNA is sequenced, the sequence given as
Seq Id 4 is obtained. This DNA sequence is 96.3% homologous to the corresponding DNA sequence of Seq Id 1.
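The primer coordinates can be checked for length and for the size of the region they span on Seq Id 1. The following sketch uses the coordinates listed above; the class and function names are illustrative, and the start coordinate 40309 of the third oligonucleotide is an assumption inferred from the 20-nucleotide length of the other three, since that figure is partly illegible in the source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Primer:
    start: int  # first nucleotide as written in the text (5' end of the primer)
    end: int    # last nucleotide as written in the text

    @property
    def length(self) -> int:
        return abs(self.end - self.start) + 1

    @property
    def complementary(self) -> bool:
        # Coordinates given in descending order denote the complementary strand.
        return self.start > self.end


def template_span(forward: Primer, reverse: Primer) -> int:
    """Length of the Seq Id 1 region bounded by the 5' ends of both primers."""
    return reverse.start - forward.start + 1


PAIR_1 = (Primer(35073, 35092), Primer(37848, 37829))
PAIR_2 = (Primer(40309, 40328), Primer(42018, 41999))  # 40309 is an assumed value

for forward, reverse in (PAIR_1, PAIR_2):
    print(forward.length, reverse.length, reverse.complementary,
          template_span(forward, reverse))
```

The spans refer to the Tolypocladium niveum sequence only; products amplified from other species need not have the same length, as the 1713 bp product from Neocosmospora vasinfecta shows when compared with the span of the second pair.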

Example 17: Protoplastisation and transformation of Tolypocladium niveum

a) Method 1:

200 ml of medium 1 (maltose (monohydrate) 50 g/l, casein peptone, digested with trypsin (Fluka 70169)
10 g/l, KH2PO4 5 g/l, KCl 2.5 g/l, pH 5.6) in a conical flask are inoculated with 10^9 spores of Tolypocladium
niveum and are incubated at 27°C, 250 rpm for approximately 70 hours. 200 µl of (0.1%) β-mercaptoethanol are
added and incubation continued for a further 16 hours. The mycelium is harvested by centrifugation (Beckman
J2-21 centrifuge, rotor JA14, 8000 rpm, 20°C, 5 minutes), washed in 40 ml of TPS (NaCl 0.6 M, KH2PO4/NaH2PO4
66 mM pH 6.2) and the pellet volume measured by centrifugation in calibrated microtubes at 2000 g (in Beckman
GPR centrifuge, GH3.7 rotor, 3000 rpm, 5 minutes). The mycelium is suspended in TPS (3 ml of TPS are
used for each 1 ml of pellet volume) and the same volume of protoplastisation solution is added (Novozym 234
10 mg/ml (from Novo Industri, batch PPM-2415), cytohelicase 5 mg/ml (from IBF), Zymolyase 20T 1 mg/ml (from
Seikagaku Kogyo, batch no. 120491)). The suspension is incubated at 27°C at 80 rpm for approximately 60
minutes. The protoplasts are filtered through a milk filter, centrifuged out (700 g, 10 minutes) and taken up
in a total of 4 ml of TPS. Each 1 ml of this suspension is layered on to 4 ml of 35% saccharose solution and is
centrifuged at 600 g, 20°C for 20 minutes. The protoplast bands at the phase interface are drawn off, each
diluted to 10 ml with TPS, centrifuged out, carefully resuspended in 200 µl portions of TPS and the suspensions
are combined. For each 1 ml of pellet volume of starting mycelium (see above), approximately 2 x 10^8
protoplasts are obtained.

The protoplast suspension is centrifuged out (700 g, 10 minutes) and suspended in 1 M sorbitol, 50 mM
CaCl2 at a density of 1 x 10^8. 90 µl portions of this suspension are combined with 10 µl of the vector DNA to
be transformed, which contains the amdS gene from Aspergillus nidulans, for example plasmid p3SR2 (Hynes
et al., 1983), (1-10 µg dissolved in tris-HCl 10 mM, EDTA 1 mM, pH 8.0) and 25 µl of PEG 6000 solution are
added (25% PEG 6000, 50 mM CaCl2, 10 mM tris-HCl, pH 7.5, freshly prepared from the stock solutions: 60% PEG
6000 (from BDH), 250 mM tris-HCl pH 7.5, 250 mM CaCl2). The transformation batch is placed on ice for 20
minutes and then a further 500 µl of the mixed PEG 6000 solution are added and carefully mixed in. After 5
minutes at room temperature, 1 ml of 0.9 M NaCl, 50 mM CaCl2 is added, the entire batch added to 7 ml of
melted soft agar TMMAAC+N, held at 45°C, and cast onto preheated TMMAAC+N plates. Medium TMMAAC+N
contains 6 g/l glucose, 3 g/l KH2PO4, 0.5 g/l KCl, 0.4 g/l MgSO4 x 7 H2O, 0.2 g/l CaCl2 x 2 H2O, 8 mM
acrylamide, 2.1 g/l CsCl, 1 ml/l trace element solution, and 0.6 M NaCl. 15 g/l of Agar-Agar (Merck) are used
for plates and 7 g/l for soft agar. The trace element solution contains 1 mg/ml of FeSO4 x 7 H2O, 9 mg/ml of
ZnSO4 x 7 H2O, 0.4 mg/ml of CuSO4 x 5 H2O, 0.1 mg/ml of MnSO4 x H2O, 0.1 mg/ml of H3BO3 and 0.1 mg/ml of
Na2MoO4 x H2O. Transformants are capable of using acrylamide as a source of nitrogen in the medium and may
therefore be identified after approximately 3 weeks at 25°C as colonies against weak background growth.
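The PEG 6000 working solution is mixed from more concentrated stock solutions, so the stock volumes follow from the dilution relation C(stock) x V(stock) = C(final) x V(final). The sketch below works this out for an arbitrary final volume of 1000 µl. The final volume, the use of water to make up the remainder and all names in the code are assumptions made for illustration; the text gives only the stock and final concentrations.

```python
def stock_volumes(final_volume_ul, components):
    """Volumes of each stock needed for a mixture.

    components maps a name to (stock concentration, final concentration),
    with both values in the same unit for a given component.
    """
    volumes = {
        name: final_volume_ul * final / stock
        for name, (stock, final) in components.items()
    }
    diluent = final_volume_ul - sum(volumes.values())
    if diluent < 0:
        raise ValueError("stocks are too dilute for the requested mixture")
    volumes["water"] = diluent  # make-up with water is an assumption
    return volumes


# Stock and final concentrations as stated in the text.
PEG_MIX = {
    "PEG 6000 (%)": (60.0, 25.0),
    "tris-HCl pH 7.5 (mM)": (250.0, 10.0),
    "CaCl2 (mM)": (250.0, 50.0),
}

for name, volume in stock_volumes(1000.0, PEG_MIX).items():
    print(f"{name}: {volume:.1f} ul")
```

Scaling the final volume scales every component proportionally, so the same function can be used for the smaller volumes actually needed per transformation batch.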

b) Method 2: 

Two portions each of 4.0 ml of the Tolypocladium niveum spores (ATCC 34921; 5 x 10^[...]/ml) are introduced
into a 1 l conical flask with 200 ml of medium 1 (50 g/l maltose (monohydrate), 10 g/l casein peptone, digested
with trypsin, Fluka 70169, 5 g/l KH2PO4, 2.5 g/l KCl, pH 5.6) and are shaken at 25°C at 250 rpm for 65 hours.
The mycelium is filtered out over a sterile sintered porcelain filter with GMX nylon gauze and washed with TE
(10 mM tris-Cl pH 7.5, 1 mM EDTA) and resuspended in 40 ml of YG (5 g/l yeast extract, 20 g/l dextrose).
Centrifugation is carried out at 900 g and 20°C for 5 minutes. The pellet is resuspended in YG (approximately
1 ml pellet in 5 ml) and 5 ml of protoplastisation solution are added to 5 ml of suspension. The
protoplastisation solution is produced from a solution containing 1.1 M KCl and 0.1 M citric acid. The pH is
adjusted to 5.8 with KOH. Driselase (Sigma D9515) is added (15 mg/ml; storage at -20°C); the suspension remains
on ice for 15 minutes and the starch carrier is removed by centrifugation for 5 minutes at 2000 rpm. Novozym
(4 mg/ml) and bovine serum albumin (Sigma A7096, 20 mg/ml) are added. The solution is filtered through
Millipore SLGV025LS and remains on ice until used. The preparation is shaken at 37°C for 2.5 hours at 250 rpm.
The preparation is filtered through a milk filter. The protoplasts are centrifuged out (700 g; 20°C; 5 minutes)
and carefully resuspended in STC (1.2 M sorbitol, 50 mM CaCl2, 10 mM tris-HCl pH 7.5). 5 ml of 35% saccharose
solution are carefully covered with a layer of the suspension and centrifuged (600 g; 20°C; 20 minutes).
The bands are drawn off and diluted to approximately 5 ml with STC. 2 x 10^8 protoplasts are obtained from
200 ml of culture.

50 µl of the protoplast suspension (1 x 10^[...]/ml) are introduced into a sterile Eppendorf tube and 5 µg of
plasmid DNA in TE and 12.5 µl of PEG solution (20% PEG 4000, 50 mM CaCl2, 10 mM tris-HCl pH 7.5) are
added. This solution is mixed from separately autoclaved stock solutions: 1 M CaCl2, 1 M tris-HCl pH 7.5, 60%
PEG 4000 (Riedel de Haen). Once the mixture has stood for 20 minutes on ice, 0.5 ml of PEG solution are added
and carefully mixed in. After 5 minutes at room temperature, 1 ml of 0.9 M NaCl, 50 mM CaCl2 is carefully
mixed in. The suspension is added to 10 ml of TM88 sorbitol soft agar (20 g/l malt extract, 4 g/l yeast extract,
10 g/l bacto agar, 218 g/l sorbitol, pH 5.7) (45°C) and cast onto TM88 sorbitol plates (10 ml TM88 sorbitol
agar: 20 g/l malt extract, 4 g/l yeast extract, 30 g/l bacto agar, 218 g/l sorbitol, pH 5.7). After 15 to 20
hours at 25°C, 10 ml of TM88 sorbitol agar with 600 µg/ml of hygromycin (45°C) are poured over. Hygromycin
resistant transformants may be detected after 7 days at 25°C.

Example 18: Construction of vectors pSIM10, pSIM11 and pSIM12 and transformation with these plasmids

a) Isolation of the cyclophilin gene from Tolypocladium niveum

As described in Example 10, the Tolypocladium niveum gene library is screened with a radioactively
labelled DNA probe. Hybridisation is performed at 42°C in 6 x SSPE, 30% formamide, 5 x Denhardt's solution,
0.1% SDS, 100 µg/ml denatured herring sperm DNA, and 100 µM ATP. 32P-labelled DNA (fragments of the
DNA of the cyclophilin gene from Neurospora crassa, Tropschug et al., 1988) is heated to 100°C for 5 minutes
and cooled in ice before hybridisation. After 16 to 20 hours, the filters are washed three times for 10 minutes
in 2 x SSC, 0.1% SDS and twice for 30 minutes in 1 x SSC, 0.1% SDS at 45°C. The dried filters are
autoradiographed. The purified DNA from λ-phages is subcloned in plasmids and characterised by restriction
mapping, Southern hybridisation and DNA sequencing. The cDNA sequence of Seq Id 5 is obtained. The sequence
is homologous to the cyclophilin gene of N. crassa. The start codon ATG is at positions 12-14 and the stop
codon TAA is at positions 552-554.

b) Construction of vector pSIM10 and transformation with this plasmid

On the basis of the Seq Id 5, a first oligonucleotide is synthesised which is largely complementary to Seq
Id 5 (positions 2 to 29); however, the ATG region (12 to 14) is altered in such a way that a ClaI cleavage point
(ATCGAT) is produced. A second oligonucleotide contains a sequence of the plasmid pUC18 and a recognition
sequence for BamHI and is given as Seq Id 6.

A plasmid containing a 2.7 kb EcoRI-HindIII fragment from Example 18a cloned into pUC18 is linearised
with HindIII. 1 ng of the plasmid DNA is amplified with the oligonucleotides described above (Sambrook et
al., 1989): 30 cycles: 1 minute 30 sec 94°C; 2 minutes 30 sec 50°C; 6 minutes 72°C. A 2.1 kb DNA is produced.
After chloroform extraction, this DNA is purified by ultrafiltration (Ultrafree MC 100 000; Millipore) and
cleaved in the appropriate buffer with the enzymes ClaI and BamHI. 50 ng of this DNA are ligated with 50 ng of
BamHI and ClaI cleaved DNA of the plasmid pGEM7Zf (Promega). The newly produced plasmid is cleaved with ClaI
and XbaI and ligated with a ClaI-XbaI restriction fragment 1.76 kb in size from the plasmid pCSN44 (Staben
et al., 1989). A restriction map of this plasmid (pSIM10) is reproduced in figure 3.

The 2157 bp BamHI-ClaI restriction fragment of the plasmid (4714-6865 in figure 3), which contains the
cyclophilin gene promoter, has the DNA sequence of Seq Id 7.

The plasmid pSIM10 may be used for the transformation of Tolypocladium niveum, as described in
Example 17. DNA from the transformants is cleaved with BamHI and, after electrophoresis, blotted on a nylon
membrane. The 1.8 kb BglII fragment from pSIM10 (figure 3) is used as a radioactive probe. In this way, those of
the transformants in which the plasmid pSIM10 has been incorporated once or a plurality of times into the
genome may be identified.

The XhoI cleavage point in plasmid pSIM10 (4924) allows the construction of plasmids which contain
defined parts of the cyclosporin synthetase gene with which a deliberate inactivation of the cyclosporin
synthetase gene is possible:

pSIM11 contains a 3.6 kb XhoI restriction fragment (42285-45909 of Seq Id 1). If the plasmid linearised
with EcoRV is used for the transformation, approximately 30% of transformants obtained no longer form
cyclosporin. It is shown with Southern hybridisations with DNA from such transformants that an 8.4 kb XbaI
fragment is no longer detectable, but instead two new restriction fragments with 10.6 kb and 8.2 kb are
detected.

pSIM12 contains a 0.8 kb XhoI restriction fragment (39663-40461 of Seq Id 1). If the plasmid linearised
with SalI is used for the transformation, approximately 30% of transformants obtained no longer form
cyclosporin. It is shown with Southern hybridisations with DNA from such transformants that an 8.4 kb XbaI
fragment is no longer detectable, but instead two new restriction fragments with 10.4 kb and 5.6 kb are
detected.
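The fragment sizes and Southern band sizes quoted for pSIM11 and pSIM12 can be cross-checked with simple arithmetic. The sketch below computes the XhoI fragment lengths from the Seq Id 1 coordinates and the net gain in total band size for each transformant type. Reading that gain as the amount of DNA inserted into the 8.4 kb XbaI fragment is an interpretation added here for illustration and is not stated in the text; the function names are likewise illustrative.

```python
def fragment_length(first, last):
    """Length in nucleotides of a fragment given inclusive coordinates."""
    return last - first + 1


def net_size_change_kb(lost_kb, new_kbs):
    """Combined size of the new bands minus the size of the band that disappeared."""
    return round(sum(new_kbs) - lost_kb, 1)


PSIM11_INSERT_BP = fragment_length(42285, 45909)  # coordinates from the text
PSIM12_INSERT_BP = fragment_length(39663, 40461)  # coordinates from the text
print(PSIM11_INSERT_BP, PSIM12_INSERT_BP)

PSIM11_GAIN_KB = net_size_change_kb(8.4, [10.6, 8.2])
PSIM12_GAIN_KB = net_size_change_kb(8.4, [10.4, 5.6])
print(PSIM11_GAIN_KB, PSIM12_GAIN_KB)

# Difference between the two gains versus difference between the two inserts.
print(round(PSIM11_GAIN_KB - PSIM12_GAIN_KB, 1),
      round((PSIM11_INSERT_BP - PSIM12_INSERT_BP) / 1000.0, 1))
```

Under the stated interpretation, the two gains differ by about the same amount as the two XhoI inserts, which is what one would expect if the plasmids are otherwise identical.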

Example 19: Cotransformation with synp4 

pSIM10 (Example 18) is used as transformation vector. Together with this vector, equimolar quantities of
synp4 (Example 12) are also used in the same transformation batch. These cotransformations are performed
according to the method described in Example 17 and Tolypocladium niveum ATCC 34921 is used as the
starting strain.
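For double-stranded DNA, equimolar quantities of two molecules correspond to masses in proportion to their lengths. The sketch below illustrates that arithmetic. The lengths used in the example (5,000 bp and 40,000 bp) are hypothetical placeholders, since the sizes of pSIM10 and synp4 are not given in this passage, and the average mass of 650 g/mol per base pair is a common approximation rather than a value from the text.

```python
AVG_BP_MASS = 650.0  # g/mol per base pair; a common approximation (assumption)


def picomoles(mass_ug, length_bp):
    """Amount of double-stranded DNA in pmol for a given mass and length."""
    return mass_ug * 1e6 / (length_bp * AVG_BP_MASS)


def equimolar_mass_ug(ref_mass_ug, ref_length_bp, other_length_bp):
    """Mass of a second DNA giving the same molar amount as the reference."""
    return ref_mass_ug * other_length_bp / ref_length_bp


# Hypothetical lengths, for illustration only (not taken from the text).
vector_bp = 5_000
cosmid_bp = 40_000

vector_ug = 1.0
cosmid_ug = equimolar_mass_ug(vector_ug, vector_bp, cosmid_bp)
print(f"{vector_ug} ug of vector pairs with {cosmid_ug} ug of the larger DNA")
print(f"molar amounts: {picomoles(vector_ug, vector_bp):.3f} pmol each")
```

The point of the sketch is only that the larger molecule must be supplied in proportionally greater mass.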

Genomic DNA from hygromycin resistant transformants is isolated according to a rapid method. To this
end, mycelium is taken from an area of approximately 1 cm² of the corresponding colony and transferred into
Eppendorf homogenisers. 1 ml lysis buffer (50 mM EDTA, 0.2% SDS) and 100 mg aluminium oxide (grade A5,
from Sigma) are added and thoroughly homogenised for approximately 5 minutes. After centrifugation
(5 minutes, 11,000 rpm) the supernatant is extracted once with each of tris-saturated phenol, phenol/chloroform (1:1)
and chloroform/isoamyl alcohol (24:1) and the DNA precipitated with isopropanol using the standard procedure
(Sambrook et al., 1989).

The DNA is completely restricted with the restriction enzyme SalI, separated by gel electrophoresis and
investigated in Southern hybridisations. The 0.8% agarose gel is transferred by vacuum blotting (Vacu blot, from
Pharmacia) onto a nylon membrane (Duralon-UV from Stratagene) and fixed with UV.

As probe for the hybridisations, the small SpeI restriction fragment from the bacteriophage P1 vector
pNS-528tetJ4-Ad10-SacllB (from DuPont-NEN) is prepared by gel electrophoresis and Geneclean II Kit (from
BIO101) and radioactively labelled with alpha-32P dATP by "random primer" synthesis (from Stratagene).
Prehybridisation is performed for approximately 8 to 16 hours at 42°C in 6 x SSC, 50% formamide, 5 x
Denhardt's (Maniatis et al., 1982), 0.1% SDS, 0.25 mg/ml denatured herring sperm DNA, and 25 mM NaH2PO4
pH 6.5 in a volume of 10 ml per 100 cm² of membrane. After addition of the labelled probe, incubation is
continued for a further 16 to 20 hours at 42°C. The blot is washed twice for 10 minutes with 2 x SSC/0.1% SDS
at 25°C and twice for 30 minutes with 0.5 x SSC/0.1% SDS at 60°C. After autoradiography for approximately
48 to 96 hours at -70°C with Kodak intensifying film onto X-ray film (Xomatic AR, from Kodak), bands become
visible on the X-ray film.
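The volumes implied by this protocol follow from the stated 10 ml per 100 cm² of membrane and the usual dilution relation C1·V1 = C2·V2. The following sketch works through that arithmetic. The 150 cm² membrane, the 20 x SSC stock and the 100% formamide stock are assumptions made for illustration; none of them is stated in the text.

```python
def prehyb_volume_ml(membrane_area_cm2, ml_per_100_cm2=10.0):
    """Hybridisation solution volume, scaled at 10 ml per 100 cm2 of membrane."""
    return membrane_area_cm2 * ml_per_100_cm2 / 100.0


def stock_volume_ml(final_conc, stock_conc, final_volume_ml):
    """Volume of stock needed for a dilution, from C1 * V1 = C2 * V2."""
    return final_conc * final_volume_ml / stock_conc


# Hypothetical membrane area (not from the text).
area_cm2 = 150.0
total_ml = prehyb_volume_ml(area_cm2)

# Assumed stock concentrations: 20 x SSC and 100% formamide.
ssc_ml = stock_volume_ml(6.0, 20.0, total_ml)              # 6 x SSC final
formamide_ml = stock_volume_ml(50.0, 100.0, total_ml)      # 50% final

print(f"total volume: {total_ml} ml")
print(f"20 x SSC stock: {ssc_ml} ml, formamide: {formamide_ml} ml")

# Wash solutions per 100 ml, again assuming a 20 x SSC stock.
for wash_x in (2.0, 0.5):
    print(f"{wash_x} x SSC wash: {stock_volume_ml(wash_x, 20.0, 100.0)} ml stock per 100 ml")
```

The same helper can be reused for the remaining components (Denhardt's, SDS, phosphate) once their stock concentrations are fixed.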

Some of the investigated DNAs display hybridisation signals which are attributable to the integration of 
synp4. The number of signals, which should correlate with the number of integrated synp4 molecules, varies 
between 1 and 3. 

A transformant strain verified in this manner is investigated for cyclosporin A formation by test fermentation
in a shaking flask as described by Dreyfuss et al. (1976). Whilst approximately 100 µg/ml of cyclosporin A is
formed in parallel tests of the untransformed starting strain Tolypocladium niveum ATCC 34921, approximately
150 µg/ml of cyclosporin A is detected in tests with the strain in which additional copies of the cyclosporin
synthetase gene are present due to the integration of synp4.
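The two approximate titres can be expressed as a relative change. This is a small illustrative calculation using only the figures quoted above; the function names are not from the source.

```python
def fold_change(treated, control):
    """Ratio of the transformant titre to the starting-strain titre."""
    return treated / control


def percent_increase(treated, control):
    """Relative increase over the starting strain, in percent."""
    return (treated - control) / control * 100.0


starting_strain_ug_per_ml = 100.0  # approximate value from the text
transformant_ug_per_ml = 150.0     # approximate value from the text

print(fold_change(transformant_ug_per_ml, starting_strain_ug_per_ml))
print(percent_increase(transformant_ug_per_ml, starting_strain_ug_per_ml))
```

Since both titres are stated as approximate, the resulting 1.5-fold figure should be read as approximate as well.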

Abbreviations used:

ACV         aminoadipyl-cysteinyl-valine
amdS        acetamidase gene
ATCC        American Type Culture Collection
ATP         adenosine triphosphate
bp          base pairs
CBS         Centraalbureau voor Schimmelcultures
DTE         dithioerythritol
DTT         dithiothreitol
EDTA        ethylenediaminetetraacetic acid
HEPES       N-2-hydroxyethyl-piperazine-N-2-propanesulphonic acid
MOPS        3-morpholinepropanesulphonic acid
PEG         polyethylene glycol
pfu         plaque forming units
SDS         sodium dodecyl sulphate
SDS-PAGE    SDS-polyacrylamide gel electrophoresis
SSC         150 mM NaCl, 15 mM sodium citrate, pH 7.0
SSPE        160 mM NaCl, 10 mM sodium phosphate, 1 mM EDTA, pH 7.7
TE          10 mM tris-Cl pH 7.5, 1 mM EDTA
TFA         trifluoroacetic acid
tris        tris(hydroxymethyl)aminomethane
YAC         yeast artificial chromosome



Moreover, the customary abbreviations for the restriction endonucleases are used (Sau3A, HindIII, EcoRI,
HindIII, ClaI etc.; Maniatis et al., 1982). The nucleotide abbreviations A, T, C, G are used for DNA sequences
and the amino acid abbreviations (Arg, Asn, Asp, Cys etc.; or R, N, D, C etc.) for polypeptides (Sambrook et
al., 1989).
SEQUENCE LISTING

(1) GENERAL INFORMATION:

    (i) APPLICANT:
        (A) NAME: Sandoz Ltd
        (B) STREET: Lichtstrasse 35
        (C) CITY: Basel
        (E) COUNTRY: Switzerland
        (F) POSTAL CODE (ZIP): CH-4002
        (G) TELEPHONE: 41-61-324 4395
        (H) TELEFAX: 41-61-322 7532

        (A) NAME: Sandoz-Patent-GmbH
        (B) STREET: Humboldstr. 3
        (C) CITY: Loerrach
        (E) COUNTRY: Germany
        (F) POSTAL CODE (ZIP): D-7850

        (A) NAME: Sandoz-Erfindungen Verwaltungsgesellschaft mbH
        (B) STREET: Brunnerstr 59
        (C) CITY: Vienna
        (E) COUNTRY: Austria
        (F) POSTAL CODE (ZIP): A-1235

    (ii) TITLE OF INVENTION: Cyclosporin Synthetase
    (iii) NUMBER OF SEQUENCES: 7

    (iv) COMPUTER READABLE FORM:
        (A) MEDIUM TYPE: Floppy disk
        (B) COMPUTER: IBM PC compatible
        (C) OPERATING SYSTEM: PC-DOS/MS-DOS
        (D) SOFTWARE: PatentIn Release #1.0, Version #1.25 (EPO)

(2) INFORMATION FOR SEQ ID NO: 1:

    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 46899 base pairs
        (B) TYPE: nucleic acid
        (C) STRANDEDNESS: single
        (D) TOPOLOGY: unknown

    (ii) MOLECULE TYPE: DNA (genomic)
    (iii) HYPOTHETICAL: NO
    (iii) ANTI-SENSE: NO

    (vi) ORIGINAL SOURCE:
        (A) ORGANISM: Tolypocladium niveum
        (B) STRAIN: ATCC 34921

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

GAATTCAGTA TCGGGCAAAT CTTCATGGTG ATGTGAATCT AGCGAGATGA ATGCAGGAGA 60
ATCGGCTGGG ATGGCCTCCA GATATACACC CTTCTAGCAT CACAAATCCC GCCGATGTAC 120
AAGCCCCACG ACGAACGTTC TTATTGGCTT AACCGCTACT AGTATTTTTA TATAGTAGTT 180
TATATGCGTA GGTACTCTCT TCTGTTAATG TCAGAGGATC TATTGCGATG GGCAGGCTGC 240
AGCAATGCCT CGATCTTGAT GGAGGGATAG TTGTTTGCTG ATGAGTATAG GTACTTATTC 300
TATTAGTAAC TCTATGCTTG TTTTAAGGTA CCGATACTCG TACGTCGATC GTGGGGGGTG 360
TAAGCCACGT GGTCCACAGT CTGACGAAGT TTCGAACCCT TCAGGGATTA TTAACAAGGT 420
AATACGGAGT AAAGGAGTAG TATCATAGCT TGGAATATGT GGAAACCCCG AGGAGGCAAT 480
CCCCTTGGCT GTCAGATTAC CTTACAAGTC TCCATCTACT GACCACGAAC TGAACTCAGT 540
TCCTTCAGTC GCTTACTATT TACTGGAACA TCTCCTCGAA TTTGGAAAAA GAAAAAAGCA 600
CCAACAAAAA CTCAGGAGAT CCACTCTTTA TCGGACACAA ATAGCTACTT GCTTTCTGTG 660
CCGTGCAACG ATACTGTCGG AAAGCTCGAC CTACGAGCCA CTTACACCTG TGGTAGCAGC 720
ACAAAGCCGG ACTCGCCACA ACTCAGCAAC TAGCCATTCG AAATCGCAAA CTACAGCAGC 780
TACACGAACT TCATGAGATG GATTGTACAT ACTGACTACA CTAGGTTTAC TAACAGATAG 840
ACAACCATTG CCAGATTATA GAGCCTTTTG CTTTCTTGGT CAACATGGGC GCCATCGGGC 900
AAGACATGGC ATATGATCGC CTTGCCAACC CGTCTCGGGC GAGTTCCATC TCTTCGAACC 960
GATACTCCGA ACCTGTCGAG CAATCCTTTG CCCAGGGCAG ACTGTGGTTC CTGCACCAGC 1020
TGAAGCTCGG TGCGAGCTGG GACATTACGC CGGCCGCGAT CCGACTTCGG GGCCATCTCG 1080
ACATCGATGC GCTGAACGCT GCCTCGCGCG CTCTGACGCA GCGCCACGAG ACGCTCCGAA 1140
CGACGTTCAA GGAGCAGGAT GGCGTGGGCG TACAGGTTGT GCACGCCTCG GGCCTCGAAA 1200
GAGGGTTGAG GATTGTTGAT GCCTCGAGCC GCGATTTGGC CCAGCTCCTG GCAGAGGAAC 1260
AAACCATGAA GTTCGACCTA GAGTCTGAGC CAGCTTGGAG AGTTGCATTG TTGAAGGTGG 1320
CCGAGGATCA CCATATTCTT TCCATTGTTG TACACCATAT CATCTCAGAC AGCCGGTCTC 1380
TCGACATTAT TCAGCAGGAG CTTGGAGAAC TCTACACGGC CGCCTCGCAG GGGAAATCGA 1440
TTTCGGCTTG TCCCTTGGGT CCAATTCCCA TTCAATACCG TGACTTGACG ACTTGGCAGA 1500
ACCAGGACGA GCAGGTCGCT GAGCAGGAAA GGCAGCTCGG ATACTGGATC GAGCAGCTCG 1560
ATAACAACAC ACCGGCCGAG CTCCTCACAG AGCTTCCCCG GCCAGCTATC CCATCTGGCG 1620
AAACTGGCAA GATCTCCTTC CAGATCGATG GATCGGTACA CAAAGAACTC CTGGCCTTCT 1680
GCCGCTCCCA GCAAGTAACC GCCTACGCCG TGCTGCTGGC AGCGTTTCGC GTGGCGCACT 1740
TTCGCCTCAC TGGAGCCGAG GATGCAACCA TCGGAGCGCC CGTTGCCAAC CGCGACCGGC 1800
CGGAGCTGGA GAACATGGTG GCTCCCTTGG CCACTCTGCA GTGCATGCGA GTCGTGCTCG 1860
ACGAGGACGA CACCTTCGAG TCGGTGCTGC GGCAGATCAT GTCCGTCATG ACAGAGGCAC 1920
ATGCCAACCG CGACGTCCCC TTTGAGCGCA TCGTGTCTGC GTTGCTGCCC GGGTCGACAG 1980
ACACATCACG ACACCCGCTT GTGCAGCTCA TGTTTGCTTT GCATCCCGCG CAGGATACGG 2040
GCCGAGCCCG GTGGGGGTTC CTCGAGGCTG AGACTCTGCA GAGTGCGGCC CCGACACGAT 2100
TCGACATGGA GATGCACCTG TTTGAGGGAG ACGACCGGTT CGATGCAAAC GTGCTGTTCT 2160
CCACGGGCCT TTTCGACGCA GAGGCCATCC GCAGCGTGGT TTCTATCTTT CGGGAAGTCC 2220
TGCGCCGTGG CATCTCGGAG CCTGCGGTGC ATGTGAAGAC GATGCCGCTC ACCGATGGGC 2280
TCGCCGCGAT CCGGGACATG GGCTTGCTGG ATATCGGGAC CACCGACTAC CCCCGCGAGG 2340
CGAGCGTGGT TGATATGTTC CAAGAGCAGG TGGCCTTGAA TCCAAGCGCC ACCGCCGTGG 2400
CCGATGCTTC GTCCAGATTG AGCTACTCTG AGTTGGATCA CAAGTCAGAT CAGCTGGCCG 2460
CGTGGCTGCG CAGACGGCAG CTCAAGCCCG AGACCTTGAT TGGCGTGTTG TCTCCTCCGT 2520
CTTGCGAGAC CATGGTTTCC TTCCTCGGTA TCCTCAAGGC TCATCTGGCT TATCTGCCTC 2580
TCGATATCAA CGTTCCCTTG GCACGCATCG AATCAATCCT TTCGGCCGTG GACGGGCACA 2640
AGCTCGTCCT GCTTGGGAGC AACGTGCCCC AACCCAAGGT GGATGTACCC GATGTTGAGT 2700
TGCTGCGGAT CAGCGATGCC CTGAACGGGT CTCAGGTGAA TGGGCTTGCA GGGAAACAGG 2760
CGACTGCAAA GCCCTCGGCG ACGGACCTGG CCTACGTCAT CTTCACCTCG GGATCGACTG 2820
GCAAGCCGAA GGGTGTCATG ATCGAGCATC GGGGCATCGT ACGCCTCGTG AAAGGAACAA 2880
ACATTATTTC GCCCGCCCAG GCAGCAGTGC CGACAGCTCA CCTGGCCAAC ATCGCTTTCG 2940
ACCTCTCAAC ATGGGAGATC TATACCCCTA TCCTTAATGG CGGCACTCTT GTCTGTATCG 3000
AACACTCTGT CACGCTAGAT AGCAAGGCAC TAGAAGCTGT ATTCACCAAG GAGGGCATTC 3060
GTGTGGCCTT CCTTGCTCCT GCTCTGATCA AGCAGTGTCT CGCCGACAGA CCGGCGATCT 3120
TTGCGGGCCT GGATAGCCTG TACGCTATTG GCGATCGCTT CGACCGACGT GACGCCCTCC 3180
ATGCAAAGTC CTTGGTGAAG CATGGCGTTT ATAATGCCTA TGGTCCAACC GAGAATTCCG 3240
TCGTCAGTAC CATCTACAGC GTCTCCGAGG CTTCACCGTT TGTCACGGGG GTGCCCGTTG 3300
GCCGGGCCAT CAGCAACTCG GGCGCCTATG TAATGGATCA GGATCAGCAA TTGGTCTCTC 3360
CCGGGGTGAT GGGGGAGCTT GTGGTTTCTG GAGATGGCCT AGCTCGAGGA TATACCGATT 3420
CGGCTCTGGA TAAGAACCGA TTTGTCGTGG TGCAGATTGA CGGCGAGTCA ATCCGGGGCT 3480
ATCGTACGGG AGACCGGGCC CGATACAGCC TCAAGGGTGG CCAGATTGAG TTCTTTGGCC 3540
GCATGGATCA GCAGGTCAAG ATCCGTGGCC ATCGTATCGA GCCAGCCGAG GTAGAGCACG 3600
CTTTACTCAA CAGCGACCAA GTACGCGATG CAGCAGTGGT TATCCGGAGA CAGGAGGAGG 3660
AAGAGCCTGC GATGATTGCC TTCGTTACGA CGCAGGGTAC GCTCCCTGAT CACCTCGTCA 3720
ACATCAACGG CAACGGCCAC GTTCCCGACG GCAACGGCAG CAAGAACGAC CAATTCGCCG 3780
TTCACGTCGA GAGCGAACTG CGCCGGCGCT TGCAGATGTT GCTGCCCTCC TACATGATGC 3840
CGGCCCGCAT CGTGGTGCTT GACCATCTCC CTCTCAACCC CAACGGCAAA GTCGACCGGA 3900
AGGCGCTGGG TCAGTCGGCC AAGACTGTGC AGAAGAGCAA GCTGGTCTCA CAGCGCGTCG 3960
CCCCACGCAA TGAGATCGAG GCCGTGCTTT GCGAGGAGTA CAGGAGTGTG CTTGGTGTCG 4020
AGGTTGGCAT CACCGATAAC TTCTTCGACC TGGGTGGTCA TTCCTTGACG GCCATGAAGC 4080
TCGCGGCACG GATCAGCCAG AGGCTCGACA TTCAAGCATC CGTAGCAACT GTCTTTGAGC 4140
AGCCGATGCT CGCTGACCTC GCCGCCACGA TCCAGCGCGG CTCGACTCTG TATAGCGTCA 4200
TCCCTACGAC AGAATACACG GGACCGGTGG AGCAATCATT TGCCCAAGGC CGTCTGTGGT 4260
TCCTTGAGCA GCTGAATACC GGCGCCTCAT GGTATAATGT GATGCTCACC GTACGACTAC 4320
GAGGCCACCT CGACGTGGAT GCGCTGGGAA CGGCCCTGCT CGCCCTGGAG AAACGGCACG 4380
AGACTCTTCG GACAACCTTT GAGGAACGGG ACGGGGTTGG CATGCAGGTA GTCCACAGCA 4440
GCCTCATGGG GGAGCTGCGG CTGATTGATA TATCAGAGAA ATCTGGCACT GCCGCGCATG 4500
AGGCACTGAT GAAGGAGCAG TCAACCCGCT TCGACCTGAC TCGCGAGCCA GGTTGGAGAG 4560
TGGCGCTGCT GAAGTTGGCA GACCACCACA TCTTCTCGAT CGTCATGCAC CACATTGTAT 4620
CGGATGGATG GTCTCTCGAC CTCCTACGAC ACGAGCTGGG CCAACTCTAC TCGGCAGCTC 4680
TGCGCGGCCA GGACCCATTG TCGCGCCTTG AGCCACTCCC GATCCAATAC CGCGACTTTG 4740
CGGTCTGGCA GAAGCAAGAC AGCCAGCAGA AAGCAGCGCA CCAGAGGCAA TTGGAGTACT 4800
GGACCAAGCA GCTTGCAGAC AGCACGCCTG CAGAGCTCTT GACAGACTTC CCGCGGCCCT 4860
CGATTCTATC CGGAAAGGCT GGAAAGGTCC CCGTTGCCAT CGAGGGGTCT CTATACGACA 4920
CGCTTCAAGT CTTCAGCCGC ACCCATCAAG TCACGTCGTT TGCTGTCCTA CTCGCAGCCT 4980
TCCGTGCAGC ACATTTCCGG CTTACGGGAT CTGATAATGC GACTATTGGT GTCCCCAGCG 5040
CGAACCGGAA TCGACCTGAG CTTGAGAACG TGATCGGCTT CTTCGTGAAC ACACAATGTA 5100
TACGTATCAC GATCGATGAA AACGATAACT TTGAATCGTT GGTCCGGCAG GTCCGGTCGA 5160
CGACTACAGC CGCACAGGAC AATCAGGATG TCCCGTTCGA ACAGGTCGTT TCCAGCCTCA 5220
TGCCGAGCAG CTCGAGAGAT GCATCCCGGA ACCCTCTGGT GCAGCTCATG TTTGCACTGC 5280
ACGGCCAGCA GGATCTGTTC AAGATCCAAC TGGAAGGGAC CGAAGAGGAG GTGATCCCAA 5340
CAGAAGAAGT GACGAGGTTC GACATCGAGT TCCATCTCTA CCAAGGCGCC AGCAAGCTGA 5400
GCGGTGATAT CATATTCGCT GCCGACTTAT TCGAAGCCGA AACTATTCGT GGCGTCGTCA 5460
GCGTCTTTCA GGAGGTTCTG AGGCGCGGAT TGCAACAGCC GCAGACCCCG ATCATGACAA 5520
TGCCACTCAC CGACGGCATT CCAGAGTTGG AGAGGATGGG CTTGTTGCAC ATGGTCAAGA 5580
CCGACTACCC CCGCAACATG TCTGTGGTAG ACGTATTCCA ACAACAAGTT CGTCTCAGCG 5640
CCGAGGCTAC AGCTGTTATC GACTCATCTT CGCGGATGAG TTACGCCGAA CTGGACCAGA 5700
GGTCCGATCA GGTGGCAGCG TGGCTTCGCC AGCGACAACT GCCAGCCGAA ACCTTTGTGG 5760
CAGTGCTCGC ACCACGCTCG TGCGAGGCCG TCATTGCTCT CTTCGGCATC TTGAAGGCTG 5820
GTCATGCCTA CCTACCGCTC GACGTCAATG TGCCAGCAGC GCGTCTTCGC GCCATCTTGG 5880
CCGAGGTGAA GGGCGAGAAG CTGGTTCTCC TAGGAGCAGG TGAGCCATCA CCGGAAGGCC 5940
AGTCGCCAGA GGTCTCGATC GTGAGGATTG CCGATGCCAC GAGCCCTGCT GGCCATGCCA 6000
GCTTGCGTGA TGGCAAGTCC AAGCCAACCG CAGGCAGCCT CGCCTATGTC ATCTTCACTT 6060
CCGGATCCAC TGGTAAACCC AAGGGTGTGA TGATCGAGCA CCGCGGAGTC TTGCGCCTTG 6120
TGAAGCAGAC CAACATTCTA TCCAGTCTAC CGCCGGCGCA GACCTTCCGA ATGGCTCACA 6180
TGTCCAACCT TGCGTTCGAT GCATCGATAT GGGAGGTCTT CACGGCCCTT CTCAACGGAG 6240
GCTCTCTTGT ATGCATTGAC AGGTTTACCA TCTTGGATGC TCAAGCGTTG GAGGCACTAT 6300
TCCTCAGGGA GCACATCAAT ATTGCACTGT TCCCACCCGC CCTGTTGAAG CAATGCCTCA 6360
CGGATGCAGC TGCTACCATC AAGTCTCTTG ACCTCCTATA CGTAGGAGGA GACCGGTTAG 6420
ACACAGCGGA CGCAGCTCTG GCCAAAGCTC TGGTCAAGTC AGAGGTCTAC AATGCCTACG 6480
GCCCAACGGA AAATACGGTC ATGAGCACTT TATACTCGAT TGCTGACACA GAACGATTTG 6540
TTAATGGTGT GCCAATTGGA AGAGCCGTTA GCAACTCTGG GGTCTACGTG ATGGACCAGA 6600
ATCAGCAGCT TGTGCCGTTG GGCGTGATGG GAGAGCTGGT AGTCACTGGA GATGGTTTGG 6660
CTCGTGGCTA CACCAACCCG GCTCTTGATT CCGACCGGTT CGTGGATGTC ATTGCTCGAG 6720
GCCAACTTCT CAGGGCCTAT CGCACAGGCG ACCGAGCTCG TTACCGGCCC AAGGATGGCC 6780
AGGTTGAGTT CTTTGGTCGG ATGGATCACC AGGTCAAGGT CCGAGGGCAC CGCATCGAGC 6840
TCGCCGAAGT AGAACACGCT TTGTTAAGCA GTGCCGGTGT GCACGATGCC GTTGTCGTTT 6900
CAAACTCGCA GGAAGACAAT CAGGGAGTCG AGATGGTGGC CTTCATCACC GCCCAAGACA 6960
ACGAGACTCT CCAGGAAGCA CAGTCGAGCA ACCAAGTCCA GGAATGGGAG AGCCATTTCG 7020
AGACCACGGC CTACGCGGAC ATCACGGCCA TTGATCAAAA CACGCTCGGC GGAGACTTTA 7080
CATCCTGGAC CTCTATGTAC GATGGAACGC TTATTGACAA GAGGGAGATG CAGGAATGGC 7140
TCGACGATAC TATGCGCACT TTCCTTGACG GTCAAGCAGC TGGCCACGTG CTTGAAATCG 7200
GTACCGGCAC CGGTATGGTT CTATTCAATC TCGGTCAAGC TGGGCTGAAG AGCTACATTG 7260
GACTGGAACC TTCCCAATCC GCGGTTCAAT TCGTCAACAA GGCAGCCCAA ACGTTCCCAG 7320
GGCTTGAGGG AAAGGCCCAA GTACATGTCG GCACGGCGAT GGATACGGGC CGGCTCAGCG 7380
CTTTGAGCCC GGATCTGATC GTCATCAACT CCGTGGCCCA GTATTTCCCG AGCCGAGAAT 7440
ACCTCGCCGA GGTGGTTGAG GCCCTGGTCC GGATTCCAGG CGTTCGCCGT ATCTTCTTCG 7500
GAGACATGAG AACCTATGCC ACCCACAAAG ACTTCCTTGT TGCACGGGCG GTCCACACAA 7560
ACGGGAGCAA GGTGACGAGA TCTAAAGTGC AACAGGAGGT GGCCCGGTTA GAGGAACTGG 7620
AGGAGGAATT GCTTGTCGAC CCTGCCTTCT TCACAAGTCT CAAGGAATCT CTATCGGAAG 7680
AAATAGAGCA TGTTGAGATC CTGCCGAAGA ACATGAAGGT GAACAACGAG CTCAGCTCAT 7740
ACCGGTACGG CGCGGTTCTG CACATCCGTA ACCACAACCA GAATCAAAGC AGGTCGATTC 7800
ACAAGATCAA TGCAGAGTCC TGGATCGACT TCGCCTCAAG CCAGATGGAT AGACAGGGTC 7860
TTGCTAGGCT GTTGAAAGAG AACAAAGATG CCGAAAGTAT CGCTGTGTTC AACATCCCTT 7920
ACAGCAAGAC TATCGTGGAA CGGCACATCG CCAAGTCTTT GGCCGATGAC CACGACGGCG 7980
ATGATACACA TAGCTCAATC GATGGAGTCG CCTGGATCTC AGCCGCGCGC GAGAAGGCGA 8040
GCCAGTGTCC ATCTCTTGAT GTGCATGACC TCGTGCAGTT GGCCGAGGAC GCTGGGTTCC 8100
GCGTCGAGGT CAGCTGGGCC CGCCAAAGGT CCCAGAACGG CGCTCTCGAT GTTTTCTTCC 8160
ATCACTTCCA GCCTACCGAG AACGAAAGCC GCGCGCTCGT CGATTTCCCC ACCGACTACA 8220
AGGGCCAACA AGCCAGAAGC CTGACGAACC GGCCCCTGCA GCGGGTTGAG AGCCGTCGAA 8280
TCGAAGCACA GGTCCGAGAG CAGCTCCAAG
5 GGATTGTGGT TCTCCAGAAC ATGCCGCTGA 

TCACCCTTCG AGCCAAGGTC ACCGCCGCAC 
GTGATTCTAT TGAAGCCATC ATCTGCAAGG 
10 GTATTACAGA CAACTTCTTT AATGTCGGAG 

CACGCCTGAG CCGACAACTC AATGCCCAGA 
TTATCGCCGA TCTGGCAGCC ACAATCCAGC 
CGACTTCTTA TACGGGTCCA GTCGAACAAT 

15 

ATCAACTGAA TGTCGGCGCC ACATGGTATC 
CTTTGGTTGT TTCTGCTCTC GCTGCCGCTC 
TGCGAACAAC CTTTATCGAA CAGGAAGGCA 

20 

CTAAGGAACT GAGGGTGATC GATGTCTCGG 
TGGAAAAGGA ACAGACAACA CCCTTCAATC 
T AC TGAAGAC AGGAGAGGAC GAACACATTC 

25 ATGGCTGGTC TGTCGATATC TTCCAACAAG 

GCGGACACGA TCCTTTGGCC CAGATCGCAC 
CTTGGCAGAG GCAGATATTC CAAGTCGCAG 

30 AACAGCTTGC CGATAATAAA CCAGCCGAGC 

TCTCCGGCCG CGCGGGCGAG ATCCCGGTGG 
AGGACTTCTG TCGAATCCGC CAGGTGACCG 
CAGCGCACTA TCGTATGACC GGGACTGAGG 

35 

GTAACCGGCC GGAGCTTGAG GGCTTGATCG 
TCACCGTCGA TGTAGAGGAT TCGTTCGAAA 
TGGCTGCACA TGCCAACCAG GATGTTCCTT 

40 

GATCGAGCGA CACTTCTCGG AATCCGCTGG 
AGAACCTTGG CAAGGTCCGC CTCGAGGGTA 
CCACGAGATT TGATATCGAG TTCCATCTGT 



TCGTCTATGC 


AGCTGATCTC 


TTCGTGCCCG 


AAGGCATCCT 


ACAGAAAGGC 


CTCGGCGAGC 


ATGGTGGGCT 


GGAGTCCCTC 


CGAAGCACAG 


CGTGCGATGC 


TTCAGTGGTG 


CAGATCTTCA 


TCGCGGTGAG 


AGATGAATCA 


ACACGGCTGA 


AAGTGGCTTG 


CTGGCTATCT 


CGGCGAGGTA 


CACCACGCTC 


GTGCGAGACA 


ATCGTGGCCA 



55 



TATTGCTCCC 


GGCATACATG 


ATCCCAGCCC 


8340 


ACACGAGCGG 


CAAGGTAGAT 


CGCAAGGAGC 


8400 


GTACGCCGAG 


CTCCGAACTC 


GTGGCTCCTC 


8460 


AATTCAAGGA 


TGTTCTCGGC 

m^ ^ a mv ^^^^^^ 


GTCGAAGTGG 


8520 


GACACTCTCT 


TTTGGCCACG 


AAGCTCGCAG 


8580 


TCGCAGTCAA 


AGACATCTTC 


GACCGGCCAG 


8640 


AGGATACCAC 


GGAGCACAAC 


CCTATCCTAC 


8700 

w/ ' w/ 


CGTTCGCCCA 


AGGCCGACTC 


TGGTTCCTCG 


8760 


TCATGCCCTT 

A \m+* &-X \^ A 


CGCAGTCCGG 


CTGCGAGGGC 


8820 

W7 W7 W 


TTCTGGCCCT 


AGAGGAGCGC 


CACGAGACAC 


8880 

%f WJ If 


TCGGCATGCA 


GGTCATCCAT 


CCGTTTGCCC 


8940 


GCGAGGAAGA 


GAGCACTATC 


CAGAAGATAC 


9000 


TCGCTTCCGA 


GCCCGGTTTC 


AGACTAGCAT 


9060 


TCTCGACAGT 


AATGCACCAT 


GCAATCTCTG 


9120 


AAATCGGCCA 


ATTCTACTCG 


GCAATCCTCC 


9180 


\*>\J\* XV<1 V>u/i J- 


CCAGTATCGC 


GATTTCGCGA 


9240 

7 A> w 


RGCLACC.GGCG 


GCAGCTTGCA 


TACTGGACTA 


9300 




T T Tf* AAC5PG A 

X X XUAnUvwO 


v-»v-»*jv^Vrf x nx vjv 


9360 




r T TGATCT AT 




9420 




GTTGCTGGCT 


GCTTTCCC3CG 


9480 


ATGCGACGAT 


TGGAACACCT 


ATCGCGAACC 


9540 


GCTTCTTCGT 


CAACACACAG 


TGCATGCGTA 


9600 


CGTTGGTTCA 


CCAGGTTCGA 


GAAACGACGC 


9660 


TCGAACAGAT 


TGTCTCAAAC 


ATCTTGCCCG 


9720 


TACAGCTCAT 


GTTTGCTCTA 


CATTCGCAGC 


9780 


TCGAGGAGGA 


GATCATCTCC 


ATTGCTGAGA 


9840 


ACCAAGAGGC 


TGAGAGGCTG 


AACGGTAGTA 


9900 


AGACTATACA 


GAGCGTCATC 


ACCATCTTCC 


9960 


CGGATATGCC 


CGTCGCCTCT 


ATGGCGCTTG 


10020 


GACTGCTGCA 


CCCTCAACAA 


ACT GAT TAT C 


10080 


AACAGCAGGT 


GGCAGTCAAC 


CCGGATGTCA 


10140 


GCTATGCCGA 


CTTGGATCGG 


AAGTCGGATC 


10200 


TCGCTCCTGA 


GACGTTCGTG 


GCGATCCTGG 


10260 


TCCTCGGTGT 


GTTGAAGGCC 


AACCTTGCAT 


10320 
ATCTGCCTCT


TGATGTCAAT 


GTTCCTGCGT 


CCCGGCTCGA 


GGCCATCCTT 


TCGGAGGTGT 


10380 


CGGGATCGAT 


GTTGGTCCTT 


GTGGGCGCAG 


AGACCCCGAT 


TCCGGAGGGG ATGGCTGAAG 


10440 


CGGAGACGAT 


CCGGATCACG 


GAGATTCTCG 


CCGACGCAAA 






10500 


TGGCCGCGAG 


TCAGCCCACT 


GCAGCAAGCC 


TTGCGTATGT 




i GGATt, Q» A 


10560 


CTGGTCGACC 


AAAGGGCGTC 


ATGGTCGAGC 


ATCGCGGAAT 




AL.AAAGCAGA 


10620 


CCAACATCAC 


ATCCAAGCTG 


CCAGAGTCTT 


TCCACATGGC 




AATC T TGCC T 


1068O 


TCGATGCCTC 


CGTGTGGGAA 


GTGTTCACGA 


CGCTTCTCAA 


XGGAGGCACG 


TTGGTGTGTA 


10740 


TCGACTATTT 


CACTCTCTTG 


GAGAGCACAG 


CGCTCGAGAA 


GGTCTTCTTC 


GACCAACGCG 


10800 


TCAATGTTGC 


TCTGCTCCCT 


CCAGCCTTGC 


TGAAACAGTG 


CCT TGACAAC 


TCACCCGCTC 


10860 


TGGTCAAAAC 


TCTCAGCGTT 


CTCTATATTG 


GTGGTGATAG 


GCT AG ATGC T 


TCTGATGCTG 


10920 


CCAAAGCAAG 


GGGGCTCGTC 


CAGACGCAAG 


CTTTCAATGC 


GTACGGCCCA 


ACGGAAAACA 


10980 


CAGTCATGAG 


CACAATCTAT 


CCCATTGCCG 


AAGACCCCTT 


LAI LAAriAiT 


GTGCCCATCG 


11040 


GTCATGCTGT 


CAGTAACTCG 


GGAGCTTTTG 


TCATGGACCA 


vjAA 1 LAbuAA 


ATCACCCCCC 


11100 


CTGGTGCAAT 


GGGAGAACTC 


ATCGTGACTG 


GAGACGGTCT 




TACACTACTT 


11160 


CCTCTCTCAA 


CACTGGTCGA 


TTTATCAACG 


TTGATATCGA 


1 (jvA, \3 A AA 


vjI L.AviGGUAT 


11220 


ACCGCACAGG 


AGATCGAGTG 


CGCTACCGAC 


CAAAAGACCT 


WUjA 1 LbAA 


Tit r rcGGCC 


11280 


GTATCGATCA 


CCAGGTCAAG 


ATCCGCGGCC 


ACCGCATCGA 




GTCGAGTATG 


11340 


CTCTTCTAAG 


CCACGACCTG 


GTCACTGATG 


CGGCAGTCGT 


w Ai^L-UAC TC T 


CAAGAAAATC 


11400 


AAGACCTGGA 


GATGGTTGGA 


TTCGTGGCCG 


CCCGAGTCGC 


lunlul 1 A\jA 


GACjGATGAGT 


11460 


CCAGCAACCA 


GGTCCAAGAA 


TGGCAGACTC 


ACT TCGACAG 


LAI IrvA^A 1AL 


GC AGAI ATL. A 


11520 


CCACAATCGA 


TCAGCAAAGC 


CTTGGACGGG ACTTCATGTC 




AltoiALuArti 


11580 


GCAGCCTGAT 


CAAGAAGAGC 


CAGATGCAGG 


AGTGGCTCGA 


T-^APAfVATY; 




11640 


TGGATTCCCA 


GCCCCCTGGT 


CACGTACTCG 


AAGTTGGTAC 






11700 


TCAACCTCGG 


CAGAGAAGGG 


GGTCTGCAAA 


GCTACGTTGG 


CCTARAfiTf ft 




11760 


C AACCGCGT T 


TGTCAACAAG 


GCCGCCAAGT 


CATTCCCTGG 


GCTTGAGGAT 


AGGATCCGGG 


11820 


TTG AAGT T GG 


AACAGCAACT 


GATATCGACC 


GGCTTGGAGA 


CGATCTGCAC 


GCAGGTCTTG 


11880 


TCGTCGTCAA 


CTCGGTCGCT 


CAATACTTCC 


CGAGTCAAGA 


CTATCTCGCC 


CAGTTGGTCA 


11940 


GAGATCTTAC 


CAAGGTCCCT 


GGCGTGGAGC 


GTATCTTCTT 


TGGTGATATG 


AGGTCGCACG 


12000 


CCATCAACAG 


GGATTTCCTT 


GTCGCTCGCG 


CAGTTCATGC 


ACTGGGCGAT 


AAGGCAACAA 


12060 


AGGCCGAGAT 


TCAACGGGAG 


GTTGTTCGAA 


TGGAAGAGTC 


TGAAGACGAA 


CTGCTCGTTG 


12120 


ATCCGGCCTT 


TTTCACCTCC 


CTGACGACGC 


AAGTAGAGAA 


TATCAAGCAC 


GTGGAGATTC 


12180 


TCCCCAAGAG 


AATGCGAGCC 


ACGAACGAGC 


TGAGCTCGTA 


TCGGTATGCT 


GCTGTTCTGC 


12240 


ACGTCAATGA 


TCTGGCGAAA 


CCGGCACACA 


AAGTCAGTCC 


TGGCGCCTGG < 


GTTGATTTTG 


12300 
CCGCGACGAA GATGGATCGC GATGCCCTGA TCCGTCTGCT CAGGGGCACC AAAATTTCCG 12360
ACCACATTGC AATCGCCAAT ATTCCCAACA GCAAGACAAT CGTCGAGCGA ACCATCTGCG 12420
AATCGGTTTA CGACCTTGGC GGAGACGCCA AAGACTCGAA CGACAGAGTC TCATGGCTTT 12480
CGGCTGCTCG ATCGAATGCC GTGAAAGTGG CTTCCCTCTC CGCGATCGAT CTCGTTGATA 12540
TTGCACAAGA GGCAGGCTTC CGGGTCGAGA TCAGCTGCGC GCGGCAGTGG TCTCAGAATG 12600
GCGCGTTGGA CGCCGTATTC CACCACCTTG GCCCATCACC ACAGTCGTCT CATGTGTTGA 12660
TTGACTTCTT GACCGACCAC CAAGGTCGAC CAGAAGAAGC CCTGACGAAC CACCCGCTGC 12720
ACCGAGCACA GTCTCGACGC GTCGAGAGGC AGATCCGCGA GAGACTCCAG ACTCTCCTGC 12780
CGGCCTACAT GATCCCGGCC CAGATCATGG TTCTTGACAA GCTACCTCTC AACGCGAATG 12840
GAAAGGTCGA CCGGAAGCAG TTGACGCAAC GGGCCCAGAC GGTACCCAAG GCCAAGCAAG 12900
TGTCTGCTCC TGTGGCCCCG CGCACAGAGA TCGAAAGGGT GCTCTGCCAG GAGTTCTCCG 12960
ACGTCCTAGG GGTTGATATC GGGATAATGG AAAACTTCTT CGATCTCGGT GGCCACTCGC 13020
TCATGGCAAC AAAGCTAGCC GCACGCATCA GCCGCCGACT AGAGACTCAC GTCTCCGTCA 13080
AGGAGATCTT TGACCATCCG CGAGTCTGCG ATCTTGTTCT CATAGTACAG CAGGGATCAG 13140
CGCCTCATGA CCCCATCGTT TCGACCAAAT ACACCGGGCC AGTGCCTCAG TCGTTTGCCC 13200
AGGGTCGTCT TTGGTTCCTC GACCAGCTCA ACTTTGGCGC AACATGGTAT CTCATGCCCC 13260
TTGCCGTCCG TCTTCGCGGT GCCATGAACG TTCATGCTCT TACCGCGGCC TTGTTGGCCC 13320
TCGAGAGGCG TCACGAGCTC CTCCGCACCA CGTTCTACGA ACAAAACGGC GTCGGTATGC 13380
AAAAGGTCAA TCCAGTTGTC ACCGAGACCC TGAGGATCAT TGATCTCTCC AACGGCGACG 13440
GCGACTATCT CCCGACATTG AAGAAGGAGC AAACTGCTCC GTTCCACCTG GAAACCGAGC 13500
CCGGATGGCG CGTGGCTCTA CTGCGCCTCG GGCCAGGCGA CTACATCTTA TCTGTCGTCA 13560
TGCATCACAT CATTTCCGAC GGCTGGTCTG TGGATGTTCT CTTCCAAGAG CTGGGCCAGT 13620
TCTATTCCAC GGCTGTCAAA GGCCACGATC CCCTATCGCA GACCACACCC CTCCCGATCC 13680
ATTATCGCGA TTTTGCTCTG TGGCAGAAGA AGCCAACCCA AGAAAGCGAA CACGAGCGTC 13740
AGCTGCAATA CTGGGTCGAG CAACTTGTAG ATAGTGCCCC GGCCGAGCTA CTCACGGATC 13800
TGCCGCGGCC TTCGATCCTC TCTGGTCAGG CTGGGGAGAT GTCGGTCACG ATCGAGGGAG 13860
CACTATACAA GAACTTGGAG GAATTCTGCC GGGTCCATCG CGTTACCTCC TTCGTGGTAC 13920
TGCTTGCGGC CCTACGCGCA GCCCATTATC GCCTCACAGG TTCCGAAGAC GCAACTATAG 13980
GGACACCAAT CGCCAATCGT AACCGACCTG AACTTGAGCA GATAATCGGC TTCTTCGTCA 14040
ATACGCAATG TATACGCATT ACCGTCAACG AGGACGAGAC CTTTGAGTCA CTAGTGCAGC 14100
AGGTCCGGTC AACGGCGACA GCTGCATTCG CCCATCAGGA CGTCCCGTTC GAGAAGATCG 14160
TCTCTACTCT TTTGCCCGGT TCTCGAGATG CATCCCGAAA CCCACTTGTG CAGCTCATGT 14220
TTGCGGTGCA TTCGCAGAAG AACCTCGGTG AGCTGAAGCT GGAAAACGCT CACAGCGAGG 14280
TTGTTCCCAC GGAGATCACG ACCCGGTTCG ATTTGGAATT CCACCTGTTC CAGCAAGATG 14340
ACAAGCTTGA GGGCTCCATC CTCTATTCAA CCGATCTCTT CGAAGCAGTC TCGGTCCAAA 14400
GTCTTCTTTC AGTATTCCAG GAAATTCTGC GCCGGGGTTT GAACGGTCCG GACGTGCCCA 14460
TCAGCACCCT ACCACTTCAG GATGGAATCG TCGACCTACA AAGACAGGGC CTGTTGGATG 14520
TCCAGAAGAC GGAATATCCT CGTGATTCCT CTGTGGTTGA TGTGTTCCAT GAGCAGGTCT 14580
CGATCAACCC CGATTCCATT GCACTGATAC ATGGCTCGGA GAAGCTCAGC TACGCCCAGC 14640
TCGACAGGGA ATCTGACAGG GTCGCTCGCT GGCTCCGTCA CCGTTCTTTC AGCTCCGACA 14700
CGTTAATCGC GGTGTTGGCG CCACGGTCTT GCGAGACGAT CATCGCGTTC CTCGGAATCC 14760
TCAAGGCAAA CCTTGCGTAC CTACCTCTGG ATGTCAAGGC TCCTGCTGCC CGCATTGATG 14820
CCATTGTATC GTCGCTACCC GGGAACAAGC TCATCCTGCT GGGCGCAAAC GTTACGCCGC 14880
CCAAGCTTCA GGAAGCGGCC ATCGATTTCG TGCCCATCCG TGATACCTTC ACTACACTCA 14940
CTGACGGCAC ACTTCAAGAT GGGCCTACCA TCGAGCGACC CTCTGCGCAA AGCCTAGCGT 15000
ACGCCATGTT CACGTCTGGT TCTACCGGAC GACCGAAGGG TGTTATGGTC CAGCACCGCA 15060
ACATCGTCCG CTTGGTGAAG AACAGTAACG TCGTCGCTAA GCAGCCCGCA GCAGCTCGCA 15120
TAGCACATAT ATCCAATTTG GCGTTTGACG CCTCGTCTTG GGAGATCTAT GCCCCGCTGC 15180
TCAACGGCGG CGCAATTGTG TGTGCCGACT ACTTCACAAC GATTGATCCA CAGGCTCTTC 15240
AAGAAACCTT CCAGGAACAC GAGATCCGCG GTGCTATGCT GCCGCCCTCG CTCCTCAAGC 15300
AGTGCCTGGT TCAGGCCCCA GACATGATCA GCAGGCTTGA CATCTTATTT GCTGCTGGTG 15360
ATCGCTTCAG TAGCGTGGAT GCTCTCCAGG CCCAACGTCT CGTTGGCTCG GGCGTCTTCA 15420
ATGCGTATGG CCCTACGGAG AATACGATTC TGAGCACTAT CTATAACGTT GCTGAAAACG 15480
ACTCCTTCGT TAACGGCGTT CCCATAGGCA GTGCTGTGAG CAACTCCGGA GCCTACATCA 15540
TGGATAAGAA CCAGCAGCTC GTGCCAGCTG GAGTTATGGG AGAACTGGTT GTTACTGGTG 15600
ACGGTCTCGC CCGCGGCTAT ATGGATCCAA AGCTAGATGC AGACCGCTTT ATCCAACTGA 15660
CAGTCAACGG CAGCGAGCAA GTCAGGGCAT ATCGCACCGG CGACCGTGTG CGATACCGAC 15720
CAAAGGACTT CCAGATCGAG TTCTTCGGTC GTATGGACCA GCAAATCAAG ATCCGCGGCC 15780
ACCGTATCGA GCCGGCCGAG GTAGAGCAGG CCTTCCTGAA TGATGGCTTC GTCGAGGACG 15840
TTGCTATCGT TATTCGGACC CCAGAGAACC AAGAGCCTGA GATGGTCGCC TTTGTTACTG 15900
CTAAGGGCGA CAACTCCGCG AGAGAAGAAG AGGCTACAAC CCAGATCGAA GGTTGGGAGG 15960
CGCATTTCGA GGGTGGTGCG TACGCCAACA TCGAGGAGAT CGAAAGCGAG GCGCTTGGTT 16020
ACGACTTTAT GGGCTGGACG TCTATGTACG ATGGCACTGA GATCGACAAG GACGAGATGA 16080
GAGAGTGGCT GAATGACACG ATGCGCTCTC TCCTCGATGG AAAGCCGGCT GGTCGAGTTC 16140
TTGAGGTCGG TACCGGTACC GGTATGATCA TGTTTAACCT TGGCAGGTCA CAAGGGCTCG 16200
AAAGGTATAT TGGCCTCGAA CCTGCACCGT CGGCAGCCGA GTTCGTCAAC AACGCTGCAA 16260
AGTCATTCCC GGGTCTCGCG GGCAGGGCCG AAGTTCACGT CGGCACCGCC GCAGATGTCG 16320
GTACCCTGCA AGGCCTGACC TCAGATATGG CCGTCATCAA CTCGGTGGCG CAGTACTTCC 16380
CAACGCCAGA GTACCTGGCC GAGACGATCA AATCACTTGT CCAAGTCCCG GGCATGAAGC 16440
GCATATACCT CGGCGATATG CGGTCCTGGG CCATGAACAG GGACTTTGCT GCCGCTCGCG 16500
CCGCCTATTC ACTGGCCGAT AATGCCAGCA AAGACCGCGT GAGACAGAAG ATGATGGAAT 16560
TGGAGGAGAA GGAGGAAGAA TTACTCGTTG ACCCGGCCTT CTTCACTGCC TTGGCGAGCC 16620
AGCTGCAAGA CAGGATCCAA CACGTGGAGA TCCTACCCAA GCGAATGAAG GCTACAAATG 16680
AGCTAAGCTC GTACCGATAT GCCGCCGTGC TGCACATCTC CGACGAGCCC CTTCCTATCT 16740
ACAAGATTGA TCCCGAAGCT TGGATCAACT TTGAGGGGTC TCGATTGACC CGAGAGGCGC 16800
TTGCACAAGT ACTCAAGGAG AATGAGAACG CCGAGAGTGT GGCCATCAGC AACATTCCTT 16860
ATAGCAAGAC CGTTGTAGAA CGTCACATTG TGCGGTCGCT TGACCAGGAA GACGCCAATG 16920
CCCCTGAGGA ATCGATGGAT GGCAGCGACT GGATCTCGGC CGTGCGCACA AGAGCTCAGC 16980
AGTGCCACAC TCTCTCCGCA AGTGACCTGT TCGACATTGC AGAAGATGCT GGGTTCCGTG 17040
TTGAGGTCAG CTGGGCCCGT CAACATTCGC AGCACGGTGC CCTGGATGCC GTGTTCCACC 17100
ACCTGAAGCC CGCTACGGAG GACAGTCGCG TTTTGATCAA GTTCCCTACA GATCACCAAG 17160
GCCGGCCGCT CAAGAGCTTG ACGAATCAAC CGCTCCTGCC AGCCCAGAGT CGCCGAGCCG 17220
AGCTCTTGAT CCGCGAGGGG CTGCAAACCC TGCTGCCTCC CTACATGATC CCCTCGCAAA 17280
TCACGCTTAT CGACCGGATG CCACTCAATG CTAACGGCAA AGTCGACCGG AGAGAACTCG 17340
CCCGTCGGGC CAAAATCACA CAGAAGAGCA AGCCGGTTGA GGACATCGTT CCCCCTCGGA 17400
ACAGCGTCGA GGCTACGGTC TGCAAGGGCT TCACTGACGT GCTGGGCGTT GAAGTTGGGA 17460
TAACAGACAA TTTCTTCAAT CTGGGCGGTC ATTCACTGAT GGCAACGAAG TTGGCGGCGC 17520
GTCTAGGCCG TCAACTCAAT ACACGTATCT CGGTGAGAGA CGTCTTCGAT CAACCAGTCG 17580
TTGCTGATCT CGCCGCTGTG ATCCAACGCA ACTCGGCACC TCACGAGCCG ATCAAGCCAG 17640
CAGACTACAC AGGGCCCGTT CCCCAGTCAT TCGCACAGGG CCGCCTTTGG TTCCTCGATC 17700
AACTTAATGT CGGGGCGACG TGGTATCTTA TGCCCCTTGG TATCCGCCTT CACGGATCTC 17760
TCCGGGTTGA TGCCCTCGCT ACCGCGATAT CAGCCCTGGA GCAACGTCAC GAGCCTCTCC 17820
GTACGACATT CCACGAGGAA GATGGCGTCG GTGTTCAAGT CGTGCAAGAC CACCGGCCCA 17880
AGGATCTGAG AATCATCGAC CTGTCCACTC AGCCAAAGGA CGCCTATCTT GCCGTGTTGA 17940
AGCATGAGCA GACCACGCTC TTCGACCTCG CAACCGAGCC CGGTTGGCGC GTCGCTCTGA 18000
TCAGGCTTGG AGAAGAAGAG CATATACTTT CCATTGTTAT GCACCATATC ATATCTGACG 18060
GCTGGTCAGT AGAGGTTCTG TTTGATGAGA TGCACCGGTT CTACTCGAGT GCGCTTAGGC 18120
AACAGGATCC TATGGAGCAA ATCTTGCCTC TACCGATCCA GTACCGCGAC TTTGCAGCAT 18180
GGCAAAAGAC TGAAGAGCAG GTTGCCGAGC ATCAGCGGCA GTTGGACTAC TGGACGGAGC 18240
ACCTTGCCGA CAGTACCCCT GCGGAGCTGT TAACTGACCT CCCTCGACCT TCTATCTTGT 18300
CCGGCCGCGC CAATGAGCTA CCCCTTACCA TCGAGGGGCG TCTTCATGAT AAATTGCGCG 18360
CTTTCTGCCG


AGTTCACCAA 


GCCACGCCGT TCGTCATTCT CCTTGCGGCC 


CTGAGGGCAG 


18420 


5 


CACATTACCG 


TCTTACAGGG 


GCTGAAGATG CCACGCTTGG AACGCCAATT 


GCTAACCGTA 


18480 




ACCGACCGGA ACTTGAGAAC 


ATGATCGGCT TCTTTGTTAA CACGCAGTGT 


ATGCGTATTG 


18540 




L1AI 1 (jAWjA 


fZAATY5ATAAr 


TTCGAGTCCC TCGTCCGCCG AGTACGATCG 


ACTGCAACAT 


18600 


10 


CCGCTTTCGC 




GTTCCATTTG AATCCATTGT CTCGTCTCTT 


CTGCCTGGAT 


18660 




CCAGGGACGC 


CTCGCGCAA1 


CCATTAGTGC AGGTCATTCT GGCTGTTCAC 


TCTCAGCAGG 


18720 




ATTTGGGTAA 




GAAGGCCTCA GAGATGAAGC TGTTGACTCG 


GCTATCTCAA 


18780 


15 






CATCTTTTCG AGCACGCAGA CAGGCTCAGC GGTAGCGTGC 


18840 




rnntm Tlf'nf* IV & & 
i I 1 ALuLiiftn 




AAGCTGCGCA CGATCGAATC AGTGGTCTCC 


GTTTTCCTGG 


18900 






GCGGGCTCTC 


GACCAGCCAC TGACTCCTCT CGCTGTCTTG 


CCGCTGACTG 


18960 


20 




GGAGATCGCA 


AGCAAGGGGC TCCTTGATGT GCCCAGGACA 


GACTATCCAC 


19020 


GAGATGCAAA 


TATCGTTGAG 


GTCTTCCAAC AGCACGTTCG CGCTACCCCG 


GACGCCATCG 


19080 




CCGTCAAGGA 


CGCTACTTCC 


ATACTGACGT ATGCTCAGCT AGATCAGCAG 


TCTGATCGAC 


19140 




TTGCTATCTG 


GTTGAGTCGC 


CGGCACATGA TGCCCGAAAC GCTGGTGGGT 


GTCCTTGCGC 


19200 


25 


CGCGGTCATG 


CGAGACCATT 


ATCGCAATGT TTGGCATTAT GAAGGCCAAC 


CTCGCCTACT 


19260 




TGCCTTTGGA 


TATAAACTCG 


CCTGCTGCTC GACTCCGCAG CATTCTCTCA 


GCCGTAGATG 


19320 




GGAACAAGCT 


TGTTTTGCTC 


GGCAGTGGTG TCACAGCCCC CGAGCAAGAG AACCCCGAGG 


19380 


30 


TGGAAGCTGT 


TGGTATTCAA 


GAGATCTTGG CCGGCACTGG ACTGGACAAG 


ACACAAGGCA 


19440 




GCAACGCCCG 


ACCCTCGGCA ACGAGCCTTG CTTATGTTAT CTTCACCTCT 


GGTTCAACCG 


19500 




GCAAGCCCAA 


GGGCGTCATG 


GTCGAACATC GTAGCGTTAC GAGATTGGCA 


AAGCCCAGCA 


19560 


35 


ACGTTATCTC CAAGCTACCA CAAGGAGCCA GGGTGGCGCA CCTCGCCAAC 


ATTGCCTTCG 


19620 




ATGCCTCGAT 


CTGGGAAATT 


GCCACAACTC TTCTGAATGG AGCCACGCTT 


GTTTGTCTCG 


19680 




ACTATCACAC 


CGTTCTCGAC 


TGCAGGACTC TCAAAGAAGT CTTCGAAAGG 


GAAAGCATTA 


19740 


40 


CGGTTGTCAC 


ACTGATGCCT 


GCGCTCCTCA AGCAGTGCGT GGCCGAAATA 


CCCGAGACCC 


19800 


TCGCACACCT 


CGACCTCCTG 


TACACCGGTG GAGATCGAGT GGGTGGTCAC 


GATGCTATGC 


19860 




GGGCTCGCTC 


GCTAGTCAAG 


ATCGGCATGT TCAGCGGTTA CGGCCCTACG 


GAGAACACCG 


19920 




TCATCAGCAC 


CATCTACGAA 


GTTGATGCAG ACGAGATGTT TGTGAATGGT 


GTGCCTATCG 


19980


45 


GCAAGACTGT 


AAGCAACTCT 


GGGGCATATG TTATGGACAG GAATCAGCAG 


CTGGTGCCTA 


20040 




GTGGCGTGGT 


AGGTGAGCTT 


GTGGTCACTG GCGATGGCCT TGCTCGCGGA 


TACACTGATC 


20100 




CATCCCTAAA 


CAAGAACCGC 


TTCATTTACA TCACTGTCAA TGGAGAGAGT ATCAGGGCAT 


20160 


50 


ATCGGACTGG 


CGATCGGGTG 


AGGTACCGGC CTCATGATCT GCAGATTGAA 


TTCTTTGGCC 


20220 




GCATGGACCA GCAGGTCAAG ATCCGTGGCC ATCGAATCGA GCCGGGAGAG 


GTGGAGAGCG 


20280 




CATTGCTCAG 


TCACAACTCG 


GTACAAGACG CCGCGGTCGT CATTTGCGCG 


CCAGCAGATC 


20340 
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AAGACTCAGG 
CCCAGGAGGA 
CATATTCAGA 
CTTCTATGTA 
CCATGCGTAT 
CCGGCATGGT 
AGCCTTCAAA 
AGGATGGCCG 
TTCAACCACG 
TCTTCAGGGT 
ACATGAGAAC 
GCGAGAAGGC 
AGGAACTTCT 
TCAAGCACGT 
GATATGCCGC 
CTCCCAACGC 
TGCTGAAGGA 
CCATTGTTGA 
ACTCACTGGA 
CACTCGATGC 
GTTGGGCGCG 
CGCCCAAGGA 
TGAACACCTT 
TCCGCGAGAA 
TTGATCAGAT 
CTATCGTGGC 
AGGCTATTCT 
ACTTCTTTGA 
GCCGCCTTGA 
TTGCGGCGTC 



CCGGGCCGGC 
TTGGCGCGAC 
CTGCGTTGTC 
TTGAAGAAAG 



CGCGGAAATG 
AGAAGCAGTC 
AGTCAAGGAC 
TGACGGCAGC 
GATACTCGAC 
CATGTTCAAC 
GTCGGCAGCC 
GTCAATAGTC 
CCTCGTCGTT 
TGTGGAGGCC 
CAACGCCATC 
AAACAAGCGC 
GACGGACCCT 
CGAAATTCTC 
AGTACTACAT 
CTGGATAGAC 
GCACAAGGAT 
GCGGTTTGTC 
CGGATCAGCT 
AATGGATGTC 
TCAATGGTCC 
GGGTGCTCGC 
AACGAACCGT 
GCTGCAGACC 
GCCTGTCAAC 
CCCGAAGCCA 
GAGAGACGAA 
TCTCGGCGGG 
TGCCCATATT 
CATCCAGAGA 
TGAACAGTCA 
CTGGTACCTG 
TGCCGCACTC 
CGACGGCGTT 



GTGGCATTCG 
GATCAGGTTC 
ATTCGACAGT 
GAGATCGACA 
GCCAGAGAGC 
CTTGCCAAGT 
CAATTCGTCA 
CATGTGGGCA 
ATCAACTCAG 
CTTGTACAGA 
AACAGAGACT 
CTGGTCCGCC 
GCATTCTTTA 
CCCAAGACCA 
GTGCGTGGCT 
TTTGCGGCAG 
GCCGGGACCG 
AACAAGTCAC 
TGGGTTGCAG 
AAGGAGATTG
CAGAATGGTG 
ACACTTATTG 
CCCCTGAACA 
CTCCTGCCGC 
AACAACGGCA 
AGGTCAGCGG 
TTCGAGGACG 
CACTCACTTA 
TCCATCAAAG 
GAATCGGCTC 
TTTGCCCAAG 
ATGCCTTTAG 
TTCGCCTTGG 
GGCGTGCAAA 



TTGCCGCCCG 
AAGGGTGGGA 
CAGAAGTCGG 
AGACAGATAT 
CGGGCCACGT 



GTCCTGGTCT 
ATGATGCAGC 
CGGCGACAGA 
TAGCGCAGTA 
TCCCAAGCGT 
TCGTCGCAAG 
AGATGATCTA 
CATCTTTGCG 
TGAAGGCTAC 
CGAGAGAACA 
ACGGTCTCGA 
TCGCTATCGG 
TGAGCGAGGA 
CCGTCCGGAT 
CTCAGGAGGC 
CGCTCGATGC 
AGTTCCCGAC 
GCATTCAAAG 
CTTACATGAT 
AGATTGACCG 
CTACTCGGGT 
TGCTCGGAAC 



GAATACCGAA 
GACGCACTTC 
TAACGACTTC 
GCACGAGTGG 
ACTGGAGATC 
GCAGGGCTAC 
CCAGTCATTC 
CATCAACAAG 
TTTCCCCACG 
GGAACGCATC 
CCGAGCATTG 
TGAGCTCGAA 
TACGCGCTTG 
CAACGAGCTC 
ATCAACTATA 
CCGGCAGACC 
TAATATCCCG 
TGATATGGAG 
GGCCGCTCAA 
GGGATACCAG 
CATCTTCCAT 
GGATTACGAA 
CCGCCGTCTT 
CCCATCGCGC 
CAAGGAGCTT 
AGCCCCCCGC 
AGAAGTCAGC 



TGGCCACGAA 
ATGTCTTTGA 
CTCATGAACC 
GTCGCCTATG 
CCATCCGTAT 
AGAGACGACA 
TTGTTGGAGA 



GCTCGCCGCC 
TCAGCCGGTG 



GATTCCGCAA 
GTTCCTTGAC 
CCGTGGCCAG 
TGAGACCTTG 
GGCTCGCAAC 



GACGAAGACA 
GAAACGGCCG 
ATGGGCTGGA 
CTCAACGACA 
GGTACCGGCA 
GTCGGTTTCG 
CCGGCTCTGA 
GCTGGGCCGA 
CCAGAGTACC 
GTCTTTGGTG 
CACACCCTCG 
GCCAACGAAG 
GGTGAGAAGA 
AGCAAGTACC 
CACCAAGTCT 
CTCATCAACT 
TACAGCAAGA 
GAAGGCCAGA 
AGCTGCCCAT 
GTCGAAGTCA 
CACTTCGAAC 
GGCCGGAATG 
GGGACGCAGA 
ATCATGGTCC 
GTGCGGAGAG 
AATGAGATCG 
GTGCTGGATA 
CGCGTTAGCC 
CTGGCGGATC 
AGGCCTTACA 
CAGCTTAACC 
TTGAGGGTAG 
AGAACCACCT 
TCAGACCTTC 



20400 

20460 

20520 

20580 
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20700 
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20880 

20940
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21060 
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GGGTTCATGA CGTTTCTACC GGAGACGACG 
5 AGACTGTGCC CTTCGACCTC TCGTCAGAGC 

GCGAAGAGGA TCATGTACTA TCCATCGTCA 
TGGACATTCT CCGCGGAGAG TTGGGTCAAT 
CCTTGCTGCA CGCCAACCCC CTTCCTATTC 

10 

AGGCAAAACA GGTCGAAGAG CATCAACGTC
ACAGCACTCC AGCTGAGCTC TTGACGGATC 
CCGGGTCGGT GGATGTCACG ATCGAAGGCT 

15 GCACGCGTTC GGTAACCACA TTCGTTGTGC 

GTCTCACTGC CGTCGATGAC GCGACTATCG 
AGCTGGAGAC GTTGGTTGGC TGCTTTGTAA 

20 ACGACGATAA CTTTGAAGGT CTTGTGCGAC 

CGAACCAAGA TGTTCCTTTC GAACGAATCG 
CATCCCGCAA CCCCCTGGTT CAGCTCATGT 
AGGTCCGACT CGAGGGCTTG GAGAGTGTCA 

25 

ATATGGAATT CCACCTCGTC CCCGGCGATC 
CAGACCTTTT TGAGCAAGGC ACTATCCAGA 
GCTCCGTCCT GGACCAGCCA TTGACCCCGA 

30 

CAAACCTCGA GAGCTTGGAT CTCCTGGAGA
CAGTCGTTGA TCTCTTCCGA GAGCAAGCGG 
ACTCATCGTC GCAACTGACA TATGCTCAAC 

35 GGCTGCACGA GCGCCACATG CCGGCGGAGT 

GCGAGACTAT CATCGCGTAC TTTGGCATCA 
ATGTTTATGC GCCAGATGCC CGTCTGGCGG 

TGCTTCTGTT GGGCGCAGGT GTCCCTCAGC

CATACATCGC GGAAGCACTG AGCCATGCCA
CCTCGGCCAC CAGCCTTGCG TACGTCATTT 
GTGTCATGAT CGAGCATCGC GGCATCGTGC 

45 

TCCCGGAATC GGGATCAGCT TTGCCTGTCT 
CGACTTGGGA GATCTACACT GCCGTGCTCA 
ACACCATGCT GGACATAGCC GCGTTGAACT 
50 CCTTCTTCAC CCCTGCCTTC CTGAAGCAAT 

ACCTAGAGAT CCTTCACACG GCAGGCGATC 

55 



GGGAGTACCT 


TGAGGTACTC 


AGGAGGGAAC 


22440 


CTGGCTGGAG 


GGTTTGCCTG 


GTCAAAACGG 


22500 


TGCACCATAT 


TATTTACGAC 


GGCTGGTCCG 


22560 


TCTATTCCGC 


TGCCCTACGC 


GGCCAGGACC 


22620 


AATACCGGGA 


TTTCGCAGCG 


TGGCAAAGGG 


22680 


AACTTGGGTA 


CTGGTCGAAA 


CAGCTCGTTG 


22740 


TGCCTCGCCC 


GTCTATCTTG 


TCCGGTCGTG 


22800 


CTGTTTACGG 


AGCCCTTCAG 


TCATTCTGCC 


22860 


TTCTGACTGT 


GTTCCGGATT 


GCGCATTTCC 


22920 


GCACGCCTAT 


CGCAAACCGT 


AACCGTCCTG 


22980 


ACACGCAATG 


TATGCGTATC 


AGCATAGCCG 


23040 


AGGTGCGTAA 


TGTTGCAACG 


GCAGCTTACG 


23100 


TGTCCGCCCT 


AGTTCCAGGG 


TCGAGAAACA 


23160 


TTGCTGTCCA GTCCGTGGAA 


GATTATGACC 


23220 


TGATGCCTGG 


AGAAGCCTCC 


ACACGCTTTG 


23280 


AGAAGCTTAC 


GGGCAGCGTT 


CTTTACTCCT 


23340 


ACTTCGTCGA 


CATCTTCCAA 


GAATGTCTTC 


23400 


TCTCCGTTCT 


TCCCTTCAGC 


AACGCCATTT 


23460 


TGCCGACCTC 


AGACTACCCC 


CGCGATCGGA 


23520 


CAATCTGCCC 


CGACAGCATC 


GCCGTCAAAG 


23580 


TGGATGAGCA 


ATCCGACCGT 


GTTGCCGCCT 


23640 


CTTTGGTCGG 


TGTACTGTCG 


CCACGGTCGT 


23700 


TGAAGGCAAA 


CCTGGCTTAC 


CTGCCGTTGG 


23760 


CTATCCTGGA 


TACAGTCGAA 


GGCGAAAGAC 


23820 


CCGGCATCCA 


GATCCCTCGC 


CTGTCAACAG 


23880 


CGACCGTCGA 


TGTCACTTCC 


ATCCCACAGC 


23940 


TCACTTCGGG 


ATCTACTGGC 


AAGCCCAAGG 


24000 


GCCTGGTTAG 


AGATACCAAC 


GTCAACGTGT 


24060 


CTCACTTCTC 


CAACCTCGCC 


TGGGATGCGG 


24120 


ATGGAGGGAC 


CGTTGTGTGC 


ATTGACCGAG 


24180 


CAACATTCCG 


GAAGGAGAAC 


GTTCGGGCTG 


24240 


GCCTTGCCGA 


GACGCCAGAG 


CTGGTCGCCA 


24300 


GTCTCGATCC 


TGGAGATGCC


AACCTGGCTG 


24360 
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10 






GAAAGACAGC CAAGGGTGGT ATCTTCAACG TCCTGGGTCA CACAGAGAAC ACTGCCTATA 
GTACCTTCTA CCCTGTGGTT GGTGAGGAGA CGTTCGTCAA TGGTGTCCCC GTCGGTCGCG 
GCATCAGCAA CTCCCATGCA TATATCATCG ACCGACACCA GAAGCTCGTA CCCGCAGGTG 
TCATGGGAGA GCTTATTCTC ACTGGCGACG GTGTTGCGCG AGGTTACACC GACTCTGCGC 
TGAACAAGGA TCGATTCGTT TACATCGATA TCAACGGCAA AAGCACATGG TCGTACCGCA 
CAGGCGATAA GGCACGTTAT CGACCAAGGG ACGGCCAGCT GGAATTCTTT GGCCGCATGG 
ACCAAATGGT CAAGATCCGT GGTGTTCGAA TCGAACCCGG CGAAGTTGAG CTCACCCTGC 
TCGACCATAA GTCCGTCCTG GCCGCGACTG TGGTGGTCAG AAGACCACCC AATGGCGACC 
CGGAGATGAT TGCCTTCATC ACCATCGACG CTGAAGACGA CGTGCAAACT CACAAGGCCA 
TTTACAAGCA CCTCCAGGGT ATCTTGCCCG CGTACATGAT TCCCTCACAC CTTGTCATCC 
TTGACCAGAT GCCGGTCACC GACAACGGTA AGGTCGATCG CAAGGATCTC GCACTCAGAG 
CGCAGACAGT ACAGAAACGC AGGTCTACCG CTGCAAGGGT ACCACCTCGT GACGAGGTGG 
AGGCTGTTCT TTGCGAAGAG TACAGCAACT TACTTGAAGT TGAGGTTGGC ATTACCGACG 
GATTCTTCGA CCTGGGTGGA CATTCGCTCC TCGCCACCAA GCTTGCGGCC CGCCTAAGCC 



GACAACTCAA CACTCGCGTG TCTGTCAAGG 
TCGCTGATAT CATCCGCCGC GGTTCCCATC 



ACGTCTTTGA CCAGCCAATA CTCGCTGACC 
GCCACGATCC GATTCCTGCC ACTCCATACA 
CGGGCCCTGT CGAACAGTCG TTCGCTCAGG GCCGCCTGTG GTTCTTGGAA CAACTGAACC
TAGGTGCCAG CTGGTACTTG ATGCCCTTCG CGATCCGGAT GCGTGGGCCC CTCCAGACAA 
AGGCGCTGGC TGTCGCACTG AACGCCTTGG TGCACCGGCA CGAGGCGTTG CGGACGACTT 
TCGAGGACCA CGATGGGGTT GGTGTTCAGG TCATTCAACC AAAGTCAAGC CAAGACCTGC 
GGATCATCGA CCTATCAGAC GCTGTAGATG ATACTGCCTA TCTCGCCGCG CTCAAGAGGG 
AACAGACAAC AGCCTTCGAC CTGACCTCTG AACCAGGGTG GAGAGTGTCA CTCTTACGCC 
TAGGTGACGA TGATTACATC CTTTCTATCG TTATGCACCA CATTATCTCT GATGGCTGGA 
CTGTTGATGT GCTACGACAA GAACTCGGCC 
AGCCTTTATC GCAGGCCAAG TCCCTCCCTA 
GGCAGGAGAA CCAGATCAAG GAGCAAGCGA 
CAGATAGCAC CCCCTGCGAG TTCCTAACGG 
AAGCTGACGC CGTTCCTATG GTGATTGATG 
GCCGGACGCA CCAAGTCACA TCGTTCTCAG 



ACCGCCTTAC CGGGACACTC 
CAGAGTTGGA AGGTCTGATC 



GACGCGACGG 
GGTTTCTTCG 



AGTTCTATTC AGCTGCGATC AGGGGTCAGG 
TTCAATACCG CGACTTTGCT GTTTGGCAGA 
AGCAGCTCAA GTATTGGTCA CAGCAGCTCG 
ACCTCCCTCG GCCCTCTATC CTGTCTGGTG 
GCACGGTGTA TCAGCTCCTT ACTGATTTCT 
TCCTGCTCGC AGCCTTCCGC ACTGCCCACT 
TTGGCACACC AATCGCTAAC CGGAACCGGC 
TTAACACGCA GTGTATGAGG ATGGCAATCA 



GTGAGACTGA AACCTTTGAG TCACTAGTCC AGCAGGTTCG CTTGACTACG ACAGAAGCCT 
TTGCGAACCA AGATGTGCCG 



ATACGTCAAG GAACCCGCTT 



TTTGAGCAGA TTGTGTCAAC CCTTCTTCCT GGGTCACGAG 
GTGCAGGTCA TGTTTGCCCT GCAATCACAG CAAGACCTCG 



24420 

24480 

24540 

24600 

24660 

24720 

24780 

24840 

24900 

24960 

25020 

25080 

25140 

25200 

25260 

25320 

25380 

25440 

25500 

25560 

25620 

25680 

25740 

25800 

25860 

25920 

25980 

26040 

26100 

26160 

26220 

26280 

26340 

26400 
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GAAGAATCCA 


GCTGGAAGGT 


ATGACGGACG 


AAGCTCTGGA 


AACGCCGCTG 


TCGACGAGAC 


26460 


5 


TCGACCTTGA 


GGTTCACCTC 


TTCCAGGAGG 


TTGGAAAGCT 


GAGCGGCAGC 


CTCTTGTACT 


26520 




CCACGGACCT 


CTTCGAGGTC 


GAGACGATTC 


GTGGAATCGT 


TGATGTGTTC 


CTGGAGATCT 


26580 




TGCGCCGCGG 


CCTTGAGCAA 


CCCAAGCAGC 


GACTGATGGC 


CATGCCAATT 


ACCGATGGCA 


26640 


10 


TCACAAAGCT 


ACGCGACCAG 


GGTCTCCTAA 


CAGTGGCGAA 


ACCAGCCTAC 


CCTCGCGAAT 


26700 




CGAGTGTCAT 


AGATCTGTTC


AGACAGCAGG 


TTGCCGCCGC 


ACCGGATGCC 


ATCGCTGTGT 


26760 




GGGATTCCTC 


CTCAACATTG 


ACCTATGCCG 


ACCTCGATGG 


GCAATCGAAC 


AAGCTCGCCC 


26820 


15 


ACTGGCTGTG 


CCAGCGCAAT 


ATGGCCCCAG 


AGACCTTGGT 


AGCTGTATTC 


GCGCCACGCT 


26880 




CATGCCTCAC 


CATCGTCGCA 


TTCCTCGGTG 


TTTTGAAGGC 


TAATCTGGCC 


TACCTGCCCT 


26940 




TGGATGTCAA 


TGCGCCTGCT 


GCTCGTATCG 


AGGCTATCCT 


GTCAGCAGTA 


CCAGGCCACA 


27000 


20 


AGCTGGTCCT 


GGTGCAGGCT 


CATGGGCCCG 


AGCTTGGCCT 


GACGATGGCT 


GATACTGAAC 


27060 


TGGTGCAGAT 


CGACGAGGCA 


CTTGCATCCA 


GTTCATCCGG 


TGACCATGAG 


CAGATCCATG 


27120 




CGTCCGGCCC 


TACTGCCACA 


AGTCTTGCCT 


ACGTGATGTT 


TACGTCAGGG 


TCTACTGGGA 


27180 




AACCAAAGGG 


TGTCATGATC 


GACCACCGCA 


GCATCATTCG 


ACTTGTCAAG 


AACAGCGATG 


27240 


25 


TTGTTGCCAC 


TCTGCCTACG 


CCAGTCCGGA 


TGGCGAATGT 


ATCAAACCTT 


GCCTTCGACA 


27300 




TCTCGGTGCA 


AGAAATCTAC


ACGGCGCTCC 


TAAACGGTGG 


CACTCTGGTC 


TGCTTGGACT 


27360 




ATCTGACGCT 


ATTGGACAGC 


AAAATTCTTT 


ATAACGTTTT 


TGTGGAAGCA 


CAGGTCAACG 


27420 


30 


CCGCCATGTT 


CACGCCGGTT 


CTCCTCAAGC 


AATGTCTTGG 


AAACATGCCC 


GCCATCATCA 


27480 




GTCGCCTGAG 


TGTTCTCTTT 


AACGTTGGTG 


ACAGGCTGGA 


TGCCCACGAT 


GCTGTGGCTG 


27540 




CATCAGGCCT 


GATCCAAGAC 


GCCGTATACA 


ACGCCTACGG 


TCCCACGGAG 


AACGGCATGC 


27600 


35 


AGAGTACGAT 


GTACAAGGTC 


GACGTCAATG 


AGCCTTTCGT 


CAACGGCGTC 


CCGATCGGTC 


27660 




GATCCATCAC 


CAACTCTGGG 


GCTTACGTCA 


TGGACGGCAA 


TCAACAGCTC 


GTATCTCCTG 


27720 




GTGTGATGGG 


AGAAATTGTC 


GTTACCGGTG 


ATGGTCTTGC 


CCGTGGCTAT 


ACAGACTCAG 


27780 


40 


CCCTAGACGA 


GGACCGGTTT 


GTTCACGTCA 


CGATCGATGG 


TGAGGAAAAT 


ATCAAGGCAT 


27840 


ACCGAACCGG 


TGATCGAGTC 


CGCTACCGGC 


CCAAGGACTT 


TGAGATTGAA 


TTCTTCGGCC 


27900 




GTATGGATCA 


ACAGGTGAAG 


ATTCGTGGTC 


ACCGCATTGA 


GCCAGCAGAA 


GTGGAACATG 


27960 




CACTGCTCGG 


CCACGACTTG 






UCTTCGAAAG 


CCAGCAAATC 


28020


45 


AAGAACCAGA 


GATGATTGCT 


TTCATCACCA 


GCCAGGAAGA 


CGAGACTATC 


GAGCAGCATG 


28080




AGTCAAACAA 


GCAGGTCCAA 


GGCTGGGGAG 


AGCATTTCGA 


CGTAAGCAGG 


TATGCTGATA 


28140 




TCAAGGATCT 


CGACACTTCT 


ACCTTTGGTC 


ACGACTTTTT 


GGGATGGACA 


TCTATGTATG 


28200 


50 


ACGGAGTTGA CATTCCTGTC 


AACGAGATGA 


AAGAGTGGCT 


TGATGAAACT 


ACGGCCTCCC 


28260 




TCCTAGACAA 


CCGCCCACCT 


GGTCATATCC 


TCGAGATCGG 


AGCCGGAACT 


GGCATGATTC 


28320 




TATCTAACCT 


GGGCAAAGTC 


GACGGCCTAC 


AGAAGTATGT 


CGGTCTTGAC 


CCGGCTCCCT 


28380 
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CAGCCGCAAT CTTTGTCAAC GAAGCCGTCA 
5 GGGTACTTGT TGGAACTGCC CTGGATATCG 

AGCTTGTGGT TATCAACTCC GTGGCCCAGT 
TGGTCAAAGC TGTTGTGGAA GTGCCCAGCG 
CCCAGGCCCT TAACAGGGAC TTCCTTGCAG 

10 

CTAGCAAAGA GCAGATCCGG GAAAAGATCG
TCGTGGACCC AGCCTTCTTC GTGAGCTTGA 
AGGTCCTGCC CAAGCTGATG AAGGCCACCA 

15 TTCTACACAT CAGCCACAAC GAAGAGGAGC 

CATGGGTTGA CTTTGCAGCA ACGCAAAAGG 
AAGGACGAGA TGATGTGATG ATCGCGGTCG 

20 AGCGACACAT TATGAACTCT CTTGACCAAG 

GGATCTCAGA TGCTCGATCA GCCGCTGCAA 
CGCAGTTGGC CAAGGAGGAG GGATTCCGGG 
AAAACGGCGC CCTCGATGCC GTTTTCCACC 

25 

GTCGTGTCCT GGTACACTTC CCTACCGACC 
ACCGGCCACT CCAGCGAGCT CAGAGCCGCC 
AGACAGCACT GCCGGCCTAC ATGATCCCAT 

30 

CCAACGCCAA CGGCAAAGTG GACAGGAAAC 
AGAGAAAGGC AGTGTCAGCG CGCGTTGCGC 
AGGAATACGC AGATATCCTA GGAACTGAAG 

35 GCGGGCATTC ACTCATGGCT ACGAAGCTCG 

GGGTTACGGT CAAGGAGGTG TTCGATAAAC 
AACAGGGCTC GACACCTCAT CTGCCTATTG 

AGTCGTACGC CCAAGGTCGC TTGTGGTTCT

ATCACATGTC GCTGGCGATG AGGCTGCTCG 
CCTTACGGGC GCTGGAGCAG CGACACGAGA 
ACATCGGCGT CCAGGTCGTT CATGAGGCCG

45 

CAGACAAGAA CGAGAAGGAG CACATGGCCG 
CTCTTGCTTC AGAGCCAGGC TGGAAGGGTC 
TCCTCTCCCT CGTCATGCAT CACATGTTCT 
50 AGGAGCTCGG TCAATTCTAC TCAGCCGCTT 

AGCCCCTCCC AATACAATAT CGTGACTTTG 
CCGAGCATGA GAGGCAGCTC GCGTACTGGG 

55 



AGTCTCTGCC 


AAGTCTAGCC 


GGTAAGGCCC 


28440 


GTTCTCTGGA 


CAAGAATGAG 


ATCCAACCTG 


28500 


ACTTCCCCAC 


ATCAGAGTAC 


TTGATCAAGG 


28560 


TCAAGCGTGT 


TTTCTTTGGC 


GATATCAGAT 


28620 


CTCGTGCCGT 


TCGTGCGTTG 


GGTGACAATG 


28680 


CAGAGCTCGA 


AGAGAGCGAA 


GAAGAACTTC 


28740 


GAAGCCAGCT 


GCCCAACATC 


AAGCACGTTG 


28800


ACGAGCTGAG 


CTCGTACAGA 


TATGCTGCGG 


28860


AGCTGCTCAT 


ACAGGATATC 


GATCCCACAG 


28920 


ACTCTCAAGG 


TCTGAGAAAC 


CTTCTACAAC 


28980 


GGAACATCCC 


GTACAGCAAG 


ACCATAGTGG 


29040 


ATCACGTCAA 


CTCACTCGAC 


GGGACATCCT 


29100 


TCTGCACTTC 


GTTCGACGCA 


CCCGCCCTCA 


29160 


TAGAGTTGAG 


CTGGGCGCGA 


CAGAGATCTC 


29220 


GTCTTGCAAC 


CGATGCAAAT 


TGCGAGCGCA 


29280 


ATCAAGGTCG 


ACAACTTCGA 


ACCCTGACGA 


29340 


GTATCGAGTC 


ACAAGTCTTC 


GAGGCACTGC 


29400 


CGCGCATTAT 


CGTGCTCCCG 


CAGATGCCGA 


29460 


AGCTCGCTCG 


CCGCGCGCAG 


GTTGTGGCCA


29520 


CTCGTAATGA 


CACCGAGATA 


GTTCTTTGCG 


29580 


TTGGCATCAC 


GGACAACTTC 


TTCGACATGG 


29640 


CAGCCCGACT 


AAGCCGGCGA 


CTAGATACCC 


29700 


CCGTCCTGGC 


TGACCTCGCT 


GCTTCGATCG 


29760 


CCTCATCGGT 


GTATTCCGGA 


CCGGTGGAGC 


29820 


TGGATCAGTT 


CAACCTCAAC 


GCGACGTGGT 


29880 


GGCCGCTCAA 


CATGGACGCG 


CTGGACGTGG 


29940 


CGCTTCGCAC 


AACCTTTGAG 


GCTCAAAAGG 


30000 


GAATGAAGAG 


GCTCAAGGTC 


CTTGACCTAT 


30060 


TGCTAGAGAA 


TGAACAGATG 


AGACCGTTCA 


30120 


ATCTTGCTCG 


CCTTGGCCCC 


ACGGAGTATA 


30180 


CAGACGGCTG 


GTCCGTTGAT 


ATCCTGAGAC 


30240 


TACGTGGCAG 


GGATCCGTTA 


TCTCAGGTCA 


30300 


CGGCTTGGCA 


GAAGGAAGCT 


GCCCAAGTTG 


30360 


AGAACCAGTT 


AGCTGACAGT 


ACTCCCGGTG 


30420 
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15 
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25 
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35 






AGCTTCTGAC CGACTTTCCC CGCCCACAGT TCCTGAGTGG GAAGGCTGGT GTCATCCCGG 
TCACCATTGA GGGGCCGGTC TACGAGAAGC TTCTGAAGTT CTCCAAGGAG CGCCAGGTAA 
CTCTGTTCTC GGTGCTATTA ACAGCGTTCC GGGCCACACA CTTTCGTCTC ACTGGTGCAG 
AGGATGCTAC GATCGGTACC CCAATTGCAA ATCGCAACCG GCCAGAACTC GAGCATATCA 
TTGGATTCTT CGTCAACACC CAATGCATGC GTCTTCTCCT CGATACCGGC AGCACATTCG
AATCCCTAGT CCAGCATGTT CGGTCCGTGG CTACAGATGC CTATTCCAAT CAGGATATTC 
CCTTCGAACG GATCGTCTCG GCACTTCTCC CTGGCTCGAG AGATGCCTCA CGAAGCCCAC 
TAATCCAGCT TATGTTTGCC TTGCACTCAC AGCCAGATCT CGGGAACATT ACTCTCGAAG



GACTCGAGCA TGAGCGCCTG 
TGTTCCAAGA GCCTAACAAG 
CTGAAACAAT CAACAGCGTC



CCAACAAGCG TCGCAACACG TTTCGACATG GAGTTCCACC 
CTGAGTGGTT CAATACTCTT TGCCGATGAG CTCTTCCAGC 
GTGACTGTGT TCCAGGAGAT ACTCCGACGC GGCCTCGACC 



AACCCCAAGT CTCCATTTCT ACTATGCCCC TGACTGATGG GTTGATTGAT CTCGAGAAAC 
TGGGCTTGCT GGAAATCGAG AGCAGCAACT TCCCTCGCGA CTACTCGGTT GTCGACGTCT 
TCCGACAGCA GGTGGCTGCC AATCCAAATG CGCCCGCTGT CGTGGATTCG GAGACATCCA 
TGAGCTACAC CTCGCTAGAT CAGAAGTCTG AGCAGATTGC TGCCTGGTTA CACGCTCAAG
GCCTCCGCCC TGAGTCATTG ATCTGCGTGA TGGCGCCACG ATCTTTCGAA ACGATCGTCT 
CCTTATTCGG TATCTTGAAG GCTGGCTACG CCTACCTGCC TCTGGATGTG AATTCCCCTG 
CAGCTCGAAT CCAACCGATC CTATCCGAGG TTGAAGGAAA AAGACTGGTA CTGCTAGGAT 
CAGGGATAGA CATGCCTCAA AGCGACCGAA TGGATGTTGA AACCGCTCGA ATTCAGGACA 
TCCTAACGAA CACAAAGGTC GAGAGATCTG ATCCCATGAG CAGGCCATCG GCAACTAGCC 
TTGCCTATGT CATCTTCACC TCCGGGTCAA CTGGCCGTCC CAAGGGCGTG ATGATCGAGC 
ATCGCAATAT TCTGCGCCTT GTCAAGCAGT CTAATGTTAC GTCTCAGCTG CCGCAGGATC
TGCGCATGGC ACATATCTCC AACCTAGCCT TTGACGCGTC CATCTGGGAG ATATTCACGG 

CTTATTTGCA TTGATTACTT CACTTTGCTG GATAGTCAAG 



CAATTTTGAA TGGCGGCGCC 
CCCTCCGGAC GACATTCGAA 



AAAGCCAGGG TCAATGCTAC CCTATTCGCG CCGGCCTTGC 



TCAAAGAATG CCTCAATCAC GCGCCGACCT TGTTTGAGGA TCTCAAAGTG CTCTATATCG 
GTGGCGACCG ACTCGATGCC ACCGACGCGG CCAAAATACA AGCCCTTGTG AAGGGCACGG 
TCTACAACGC GTACGGGCCG ACAGAGAACA CAGTCATGAG CACGATCTAC AGGCTCACAG 
ATGGAGAGTC TTATGCTAAC GGTGTGCCAA TCGGCAATGC TGTGAGCAGC TCTGGCGCTT 
ATATCATGGA CCAAAAGCAG CGCCTCGTTC CTCCCGGTGT TATGGGAGAG CTCGTTGTGA 
GCGGCGATGG CCTCGCCCGT GGCTACACCA ACTCGACCCT CAATGCTGAT CGTTTCGTTG 



ATATTGTCAT CAACGATCAA 
GGCCCAAGGA TGGTAGCATC 



AAAGCCCGCG CATACCGGAC CGGAGATCGC ACTCGTTACC 
GAGTTCTTCG GCCGTATGGA TCAGCAAGTT AAAATCCGTG 



30480 

30540 

30600 

30660 

30720 

30780 

30840 

30900 

30960 

31020 

31080 

31140 

31200 

31260 

31320 

31380 

31440 

31500 

31560 

31620 

31680 

31740 

31800 

31860 

31920 

31980 

32040 

32100 

32160 

32220 

32280 

32340 

32400 
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GTCATCGAGT TGAGCCGGCC GAGGTCGAGC 
5 ATGCAGCAGT TGTTGTTCAG GCGGTGGATG 

CCATGGCCAG CGACAGATTC AGCGAAGGGG 
GGGAAGACCA CTTCGAAAGC ACCGCCTACG 
10 TGGGACGCGA TTTCACTTCA TGGACCTCGA 

AAATGGAGGA GTGGCTTGAC GATACAATGC 
CGTGTGCTGA GATCGGAACA GGTACCGGCA 
GCCTTGAGAG CTATGTCGGT ATAGAGCCTT 

15 

CAGCCCAAGA TTTCCCAGGT CTGCAAGGAA 
ACATCAAGCT GGTCAAGGAC TTCCACCCTG 
ATTTCCCGAG CCGGAGCTAC CTTGTACAGA 

20 

TCAAGACGAT CTTCTTTGGA GATATGCGAT 
CCCGAGCTCT TTACACGCTA GGTGACAAGG 
CCCGACTTGA GGAGAATGAA GACGAGTTGC 

25 CCAGCCAATG GCCCGGCAAG GTCAAGCATG 

GCAATGAACT AAGCTCGTAC CGATATGCTG 
GTAGGAACAG ATATGGCAGG CGTGTCCACT 

CGTCGTCTGG CATGGATCGT CACGCCCTCG

AGACTGTCGC CATCGGCAAC ATCCCTCACA 
CATCCCTGGA TACTGAGGGA GAAGGCATTG 
AATCGGCTAC GAAGGCAATG GCCGCGCGCT 

35 

AGATCGGCCA AGCGGCAGGA TTCAGGGTCG 
ATGGTGCACT GGACGTCGTC TTCCATCATC 
TCAACTTCCC CACAGACTTC GAGCGTCTAC 

40 TGCAGCGCAT CCAGAACCGT CGGTTCGAGT 
TGCCACCTTA TATGGTTCCA TCACGGATCG 
ACAGCAAAGT CGACCGTAAA GAATTGGCAA 

45 CTTCTGCAAC GCGCGTGGCT CCTCGCAACG 
AGGCAGTTCT TGGTGTTACA GTCGGAGTCA 
CCCTGATGGC TACGAAACTG GCCGCCCGTC 
TGAAGGATAT CTTCAACCAA CCAATCCTTC 

50 

CCGCTCCTCA TGAAGCTATT CCCTCCACGC 
CTCAGGGCCG TCTATGGTTC TTGGATCAGC 
CATTAGCGAG TCGCTTGCGA GGCCCGCTTC 

55 



AAGCCATGCT CGGCAATAAG GCTATCCATG 32460

GCCAGGAAAC GGAGATGATC GGCTTTGTTT 32520 

AGGAGGAGAT CACCAACCAA GTCCAGGAGT 32580 

CTGGCATTGA GGCCATCGAC CAGGCTACCC 32640 

TGTACAACGG CAACTTGATT GACAAAGCCG 32700 

AATCCCTCCT TGATAAGGAG GATGCCAGGC 32760

TGGTTCTATT CAATTTGCCC AAGAACGATG 32820 

CACGGTCTGC AGCCTTGTTC GTCGACAAAG 32880 

AGACGCAAAT CCTTGTCGGC ACAGCCGAGG 32940 

ACGTGGTTGT CATTAACTCG GTAGCCCAAT 33000 

TAGCGAGCGA ACTGATTCAC ATGACCAGCG 33060 

CCTGGGCCAC CAACAGGGAT TTCCTCGTGT 33120 

CTACAAAGGA TCAGATTCGC CAGGAGGTTG 33180 

TTGTTGACCC AGCATTCTTC ACCTCTTTGA 33240 

TTGAGATCTT GCCGAAGCGG ATGAGGACGA 33300 

CGGTGCTACA CATCTGCAGG GATGGGGAGG 33360 

CAGTGGAAGA GAACGCCTGG ATCGACTTCG 33420 

TTCAGATGCT CGATGAACGT AGAGACGCCA 33480 

GCAACACGAT CAACGAGCGA CACTTTACGA 33540 

CCCAAGATTC ACTGGATGGA TCCGCCTGGC 33600 

GTCCTTGCCT TTCCGTCACC GAACTGGTCG 33660 

AGGTCAGCTG GGCTCGTCAA CGATCCCAAC 33720 

TTGAAGATGA CAGAGTAGGC CGCGTCTTGA 33780 

CCCCTAGCAC CGGCCTGACC AGTCGGCCGC 33840 

CGCAGATCCG CGAACAGCTG CAAACACTGC 33900 

TCGTGTTGGA GCGGATGCCT CTCAACGCAA 33960 

GGAAGGCGAG GACCCTACAA ACCATCAAGC 34020 

ATATTGAAGC CGTCTTGTGC GACGAGTTCC 34080 

TGGATAACTT TTTCGAGTTG GGCGGACACT 34140 

TCAGTCGCCG CCTCGACACC CGCGTCTCTG 34200 

AAGATCTCGC GGACGTGGTC CAGACTGGCT 34260 

CCTACTCTGG TCCCGTGGAG CAATCCTTCT 34320 

TGAATCTCAA TGCATCGTGG TACCACATGC 34380 

GGATCGAGGC GCTGCAGTCA GCCCTGGCTA 34440 
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CGATTGAGGC 


GCGGCACGAG 


TCCCTGCGCA 


CCACATTCGA 


GGAGCAAGAT 


GGTGTTCCCG 


34500 


5 


TTCAGATTGT 


ACGCGCTGCG 


CGCAACAAGC 


AGCTGAGGAT 


CATCGACGTG 


TCGGGCACCG 


34560 




AGGATGCGTA 


TCTCGCAGCA 


TTGAAGCAAG 


AGCAAGACGC 


CGCATTCGAT 


CTGACTGCTG 


34620 




AGCCAGGCTG 


GCGAGTAGCA 


CTGTTGCGCT 


TGGGACCGGA 


TGATCATGTC 


CTGTCTATCG 


34680


10 


TCATGCACCA 


CATCATATCT 


GACGGATGGT 


CGGTTGATAT 


CCTGCGACAA 


GAACTCGGGC 


34740 




AGCTCTACTC 


GAATGCCTCA 


TCGCAGCCCG 


CTCCTCTTCC 


GATTCAATAC 


CGAGATTTCG 


34800 




CCATCTGGCA 



GAAGCAGGAT 


AGTCAGATCG 


CTGAGCACCA 


AAAGCAGCTG 


AACTACTGGA 


34860 


15 


AGAGACAACT 


GGTCAACAGC 



AAGCCGGCTG 



AGCTCCTGGC 


GGACTTCACT 


CGTCCGAAGG 


34920 




CGTTATCTGG 


CGATGCTGAT 



GTCATACCGA 



TAGAGATTGA 


TGACCAGGTA 


TATCAGAACC 


34980 




TCCGCTCGTT 


TTGTCGCGCT 



CGGCATGTCA 



CCAGCTTTGT 


TGCACTCTTA 


GCAGCTTTCC 


35040 


20 


GGGCTGCTCA 



CTACCGCCTA 



ACTGGGGCCG 



AAGATGCAAC 


TATCGGCTCT 


CCAATCGCCA 


35100 


ACAGAAATCG 


ACCTGAGCTT 



GAAGGCCTCA 



TTGGATGCTT 


TGTTAACACC 


CAGTGTCTCC 


35160 




GAATTCCTGT 



TAAGAGCGAG 



GACACATTTG 


ACACGTTGGT 


TAAACAGGCA 


CGAGAAACGG 


35220 




CGACCGAGGC 


CCAGGACAAC 



CAAGATGTCC 



CGTTCGAGAG 


GATCGTTTCT 


TCCATGGTTG 


35280 


25 


CTAGCTCGCG 


AGATACCTCG 



CGAAATCCAC 


TCGTTCAGGT 


CATGTTTGCT 


GTGCACTCTC 


35340 




AGCACGACCT 


TGGTAACATT 



CGTCTCGAAG 



GTGTTGAGGG 


GAAGCCCGTT 


TCGATGGCAG 


35400 




CGTCCACACG 


CTTTGACGCG 



GAAATGCACC 



TATTTGAGGA 



CCAAGGGATG 



CTCGGCGGCA 


35460 


30 


ACGTCGTCTT 



TTCGAAGGAT 


CTGTTCGAAT 



CCGAGACGAT 



CCGCAGTGTT 



GTGGCCGTGT 



35520 




TCCAGGAGAC


CCTGAGGCGT



GGCCTAGCCA 



ATCCTCACGC 



AAATCTCGCA 



ACACTTCCTC 


35580 




TTACCGATGG 


ATTGCCCAGT



CTTCGAAGCC 



TGTGTCTTCA 



AGTCAATCAG 



CCTGACTACC 


35640 


35 


CCCGAGATGC 


CTCCGTGATC 



GACGTTTTCA 


GAGAGCAGGT 


AGCATCGATA 


CCCAAGTCTA 


35700 




TCGCCGTTAT 


CGATGCTTCT 


TCACAGCTCA 



CCTACACCGA 


GCTCGACGAG 


AGATCTAGCC 


35760 




AGCTCGCCAC 



GTGGCTACGC 


CGACAAGTCA 


CAGTCCCTGA 


GGAGCTGGTC 


GGCGTCCTCG 


35820 


40 


CTCCACGGTC 



CTGTGAGACA 


ATCATCGCTT 


TCCTCGGCAT 


CATCAAAGCG 


AATCTCGCCT 


35880 


ATCTGCCACT 


TGACGTCAAC 


GCACCCGCTG 


GTCGGATCGA 


GACAATCCTG 


TCATCTCTAC 


35940 




CAGGAAACAG 


GCTTATTTTA 


CTTGGATCAG 


ATACGCAGGC 


GGTCAAGCTT 


CACGCAAACA 


36000 




GCGTTCGATT 


CACCCGGATC 


AGCGACGCCC 


TCGTCGAGAG 


CGGCAGTCCC 


CCTACCGAAG 



36060 


45 


AACTTTCCAC 


ACGGCCGACT 


GCACAAAGCC 


TTGCCTATGT 


CATGTTCACA 


TCAGGCTCAA 


36120 




CTGGCGTCCC 


GAAGGGTGTC 


ATGGTAGAGC 


ACCGGGGTAT 


CACACGTCTC 


GTGAAAAACA 


36180 




GCAACGTGGT 


CGCAAAGCAA 


CCGGCAGCAG 


CTGCTATCGC 


TCATCTTTCG 


AACATTGCTT 


36240 


50 


TCGACGCCTC 


TTCCTGGGAG 


ATATACGCTC 


CTCTCCTTAA 


CGGCGGTACA 


GTCGTCTGCA 


36300 




TTGATTACTA 


CACCACGATC 


GATATCAAAG 


CCCTCGAGGC 


GGTATTCAAA 


CAGCACCACA 


36360 




TCCGCGGAGC 


AATGCTTCCA 


CCAGCACTTC 


TCAAACAGTG 


TCTGGTCTCT 


GCCCCTACTA 


36420 
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TGATCAGCTC TCTGGAGATA CTTTTCGCCG 
5 TCCTGGCGCG ACGTGCCGTT GGTTCGGGCG 

CGGTCCTGAG TACGATACAC AACATCGGCG 
TTGGAAACGC TGTCAGTAAC TCCGGTGCCT 
CCGCCGGTGT GATCGGAGAG CTTGTTGTGA 

10 

ATTCTAAGCT TAGGGTGGAT CGATTCATCT 
CTTACCGCAC GGGCGACCGT GTCAGGCACC 
GGCGAATGGA TCAGCAGATC AAGATCCGTG 

15 

AGGCTCTCGC CCGTGACCCG GCCATCAGCG 
AAGAGGAGCC GGAACTGGTG GCTTTCTTCT 
GTGTCAACGG TGTGAGCGAT CAAGAGAAGA 

20 TGGAGAACAA GATCCGTCAC AACCTACAGG 
GGATCATCCA TGTCGATCAG CTGCCGGTCA 
TGGCTGTTCG AGCCCAGGCA ACGCCAAGGA 

25 GCAACGATAT CGAAACCATC ATCTGTAAGG 
GAATCACAGA CAACTTCTTC GACCTGGGTG 
CCCGCCTTAG CCGTCGACTA GATACTCGCG 
TGGTAGGCCA ATTGGCGGCT TCTATCCAGC 

30 

CGTTATCACA CTCGGGACCT GTGCAACAGT 
ACCGTTTCAA TCTCAACGCT GCCTGGTACA 
CTCTCCGAGT CGATGCACTT CAGACTGCAT 

35 

TACGCACCAC GTTCGAAGAA CAGGATGGCG 
TGAGGGACAT CTGCGTCGTA GACATCTCTG 
AGGAGCAGCA AGCTCCTTTC AATCTCTCTA 

40 AGGCTGGAGA GAACCACCAC ATCCTCTCTA 
GGTCAGTTGA CATCTTCCAG CAGGAGCTTG 
ATGACCCCCT TTCCCAGGTC AAACCGCTCC 

45 AGAGACAAGA TAAGCAAGTT GCCGTTCACG 
TCGCGGATAG CACGCCAGCC GAGATCCTAT 
GCGAAGCTGG TACAGTTCCC ATCGTGATCG 
TCTGCCGCAA TCATCAGGTC ACCAGCTTCG 

50 

ATTATCGCCT AACTGGGGCA GAGGATGCGA 
GCCCCGAACT TGAGGACTTG ATCGGTTTCT 
TCGAAGAACA CGATAATTTC CTATCAGTAG 

55 



CCGGCGATCG 


GTTGAGCAGC 


CAAGATGCCA 


36480 


TTTACAACGC 


TTACGGCCCT 


ACTGAGAACA 


36540 


AGAATGAGGC 


ATTTTCGAAT 


GGCGTTCCCA 


36600 


TTGTCATGGA 


TCAAAATCAG 


CAGCTGGTCT 


36660 


CCGGAGATGG 


CCTTGCCCGC 


GGATACACAG 


36720 


ATATTACCCT 


TGACGGGAAC 


CGGGTCAGAG 


36780 


GGCCTAAGGA 


TGGGCAAATT 


GAGTTCTTCG 


36840 


GTCATCGCAT 


CGAGCCAGCA 


GAGGTGGAGC 


36900 


ATTCGGCTGT 


TATCACTCAG


CTCACGGATG 


36960 


CATTGAAGGG 


GAATGCCAAC 


GGCACCAACG 


37020 


TCGACGGCGA 


TGAGCAACAT 


GCTCTGCTGA 


37080 


CGCTGCTGCC 


CACTTACATG 


ATCCCCTCGC 


37140 


ATGCCAACGG 


TAAGATTGAC 


CGCAATGAGC 


37200 


CCAGTTCAGT 


GTCAACCTAC 


GTGGCCCCTC 


37260 


AATTCGCAGA 


TATCCTCAGC 


GTTCGAGTCG 


37320 


GACACTCACT 


TATAGCCACC 


AAGCTAGCCG 


37380 


TGTCTGTTAG 


GGACGTCTTT 


GACACTCCCG 


37440 


AAGGCTCGAC 


CCCTCATGAA 


GCTATTCCGG 


37500 


CCTTTGCTCA 


AGGCCGTCTT 


TGGTTCCTGG 


37560 


TCATGCCATT 


CGGCGTTCGT 


CTTCGCGGAC 


37620 


TGAGGGCTCT 


CGAAGAACGG 


CACGAGTTGC 


37680 


TTGGTATGCA 


AATCGTTCAC 


TCGCCCCGAA 


37740 


GCGCCAATGA 


GGATCTTGCG 


AAGCTGAAGG 


37800 


CTGAAGTCGC 


TTGGAGGGTA 


GCACTCTTCA 


37860 


TCGTCATGCA 


TCACATAATT 


TCAGATGGCT 


37920 


CCCAATTCTA 


CTCGGTAGCT 


GTACGAGGGC 


37980 


CCATTCACTA 


CCGCGATTTT 


GCTGTCTGGC 


38040 


AAAGCCAACT 


TCAGTACTGG 


ATAGAGCAGC 


38100 


CTGATTTTAA 


CCGACCGGAG 


GTCTTGTCCG 


38160 


AGGACGAGGT 


TTATGAGAAG 


CTCTCCCTCT 


38220 


TCGTCCTTCT 


GGCTGCTTTC 


CGCGTCGCAC 


38280 


CTATCGGTAC 


ACCAATTGCG 


AACCGCAACC 


38340 


TTGTCAATAC 


ACAATGCATG 


AGAATCGCGC 


38400 


TGCGAAGAGT 


TCGCTCAACA 


GCGGCAAGCG 


38460 
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CCTTCGAAAA CCAGGATGTG CCATTCGAGC 
GAGATGCCTC CCGGAATCCC CTCGTTCAAC 
TCGGTAAACT GCAACTGGAG GGCTTGGAAG 
GCTTCGATGT TGAGTTCCAC CTCTTCGAAC

10 TCGCAGCAGA CTTGTTCGAG GCTGCCACTA 

TCCTCCGTCG TGGTCTCGAC CAGCCAGATA 
GCCTGGCGGC GCTCAACAGC CGTAACTTAC 

15 CCACCGAGGC CTCGGTGGTT GATGTCTTCC 

TGGCTGTGAC CGACACATCC ACAAAGCTTA 
ATGTCGCGGC TTGGCTGTCC AAACAGAAGC 
CGCCACGATC CTCTGAGACT ATCGTAGCAT 

20 

ATCTCCCCAT GGATTCCAAC GTCCCCGAAG 
CAGGGGAGAA GTTCGTTTTG CTTGGAGCAG 
ATGTCAGGAT GGTCTTCATC AGCGATATCG 

25 

CCGGCACTCG GCCATCTGCA TCAAGCCTTG 
GTCGGCCAAA GGGTGTCATG GTCGAGCATC 
CTTCAAGAAT ACCACAAAGT CTGCGGATGG 

30 CCGTGTGGGA GATATTCACC ACGCTGCTCA 

TTACTGTCTT GGACAGCAAA GCACTTTCTG 
CCCTGCTCCC ACCGGCCTTG CTCAAGCAAT 

35 CCCTCGAGTC TCTGTACATT GGAGGCGACC 

AGGACCTCGT CAAAGGCAAG GCCTACAATG 
GCACGATCTA TACCATCGAA CACGAGACTT 
TAGGCCCCAA GTCCAAGGCC TACATTATGG 

40 

TGATGGGAGA GCTTGTCGTT GCTGGCGATG 
TGAACACGGG CCGGTTCATC CACATCACGA 
CCGGCGATCG AGTCAGATAC CGACCTAGGG

45 

ATCAGCAGAT CAAGATTCGC GGTCATCGCA 
TCAGCGACTC ATCGATCAAC GATGCCGTTG 
AAATGGTTGG TTACATCACG ACCCAGGCTG 
50 ACAAGGTGCA GGAGTGGGAG GCTCATTTCG 
TTGATCGCGA TGCCCTCGGA CAGGACTTCT 
TGATTCCCCG TGAAGAGATG CAGGAATGGC 

55 



GCCTTGTATC


TGCACTTCTG 


V«.V^.V_»VJVj\_ J. V- J. £\ 


38520


TCATGTTTGT 




C AfW5 A A A TP 


38580


GCGAACCAAC 


CCCGTACACr 






aagacaaagc; 




AAT^TTdTPT 


38700


TCCGCAGCGT 


TGTTGAAfiTP 


i. A LLnUunbn 


38760


TCGCAATTTC 


CACCATGCCA 


v_ 1 Au A wO*AAVj 




CCGCAGTTGA 


AGACATCGAA 


CCTCZACTTCfc 


38880


AGACACAAGT 


GGTCGCTAAC 


rr* Af^ATf^r'^ 


38940


CATATGCGGA 


GCTGGATCAA 


V~rvri i. ^uwn X V_ 




TACCAGCAGA 


GAGCATCGTC 


RTTf^TTPTTfS 




GCATTGGCAT 


CCTCAAAGCG 


a appTrnr at 


39120


CCCGTCGCCA AGCAATTCTT 




39180


GAGTGCCTAT 


TCCTGACAAC 


A A tf2 Af* A /IT* Tft 


39240


TCGCCAGCAA 


GACAGACAAG 


1 L.L 1 AL, I lAL 


39300




Wl lU\tnl LA 


1 LuALAb 


39360


GGGGTGTTAT 


TTCTTTGGTG 


AAvjC AuAAJL- o 




CACATGTTTC 


CAATCTCGCA 


rp rp/~>/^» » m/ym rp 

1 ILAjAIvjCI I 


39480


ATGGAGGAAC 


GCTTTTCTGT 


AlUAljClALi 


39540


CCGCTTTCTC 


CGATCATCGC 


Al lAALAitA 


39600


GTCTTGCAGA 


CGCGCCATCT 


va ivLXbAWJl 


39660


GCCTTGATGG 


AGCTGATGCA 


nLLnnwi vjA 


39720


CCTACGGTCC 


CACCGAGAAT 


1 wwO- 1 LAI on 


^ mo a 


TTGCGAATGG 


CGTTCCCATC 




O AO AO 


ACCAGGATCA 


GCAGCTCGTA 




J7?U U 


GTCTCGCACG 


AGGGTATACC 


UA1 wWll w/\L 


j y y d u 


TCGATGGCAA 


ACAAGTTCAG 





aft n.9 n 


ACTACCAAAT CGAGTTCTTT GGCCGTTTAG 40080
TCGAGCCAGC TGAAGTGGAG CAGGCTCTTC 40140
TTGTGTCGGC ACAAAACAAG GAGGGACTCG 40200
CACAATCCGT CGACAAGGAG GAAGCCAGCA 40260
ACTCAACTGC ATATGCCAAC ATCGGGGGTA 40320
TATCCTGGAC ATCTATGTAC GACGGCTCAT 40380
TCAACGACAC TATGCGCTCA CTCCTCGACA 40440
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ACCAACCACC 


CGGAAAAGTG 


CTCGAGATCG 


GAACTGGTAC 


CGGTATGGTG 


CTGTTCAATC 


40500 


TCGGCAAGGT TGAGGGACTA CAGAGCTATG CCGGTCTTGA GCCCTCGCGC TCCGTCACCG 40560
CCTGGGTTAA CAAGGCAATC GAAACTTTCC CAAGCCTGGC AGGAAGCGCC CGAGTCCACG 40620
TTGGAACCGC CGAGGATATC AGCTCCATTG ATGGACTTCG TTCCGATCTC GTTGTGATCA 40680


ACTCAGTCGC CCAATACTTC CCAAGTCGAG AATATCTCGC TGAGCTGACG GCCAACTTGA 40740
TTCGACTGCC CGGCGTTAAG CGTATTTTCT TCGGTGACAT GAGAACGTAT GCTACCAATA 40800
AGGACTTCTT GGTGGCACGA GCAGTCCATA CCCTAGGGTC CAATGCATCG AAGGCCATGG 40860
TTCGACAACA AGTGGCCAAG CTTGAAGATG ACGAGGAAGA GTTGCTTGTT GACCCTGCCT 40920


TCTTCACCAG CCTGAGCGAC CAGTTCCCTG ACGAAATCAA GCATGTCGAA ATTCTGCCAA 40980
AGAGGATGGC CGCGACCAAC GAACTCAGCT CTTACAGATA TGCTGCTGTC ATTCATGTGG 41040
GAGGCCACCA GATGCCGAAT GGGGAGGATG AGGATAAGCA ATGGGCTGTC AAGGATATCA 41100


ATCCGAAGGC CTGGGTGGAC TTTGCTGGCA CGAGGATGGA CCGTCAGGCT CTCTTGCAGC 41160
TCCTTCAGGA CCGCCAACGT GGCGATGACG TTGTTGCCGT CAGTAACATC CCATACAGCA 41220
AGACCATCAT GGAGCGCCAT CTGTCTCAGT CACTTGATGA TGACGAGGAC GGCACTTCAG 41280


CGGTAGACGG AACGGCCTGG ATATCGCGTA CGCAATCACG GGCGAAGGAA TGCCCTGCTC 41340
TCTCAGTGGC CGACCTGATT GAGATTGGTA AGGGGATCGG CTTCGAAGTT GAGGCCAGCT 41400
GGGCTCGACA ACACTCCCAG CGCGGCGGAC TCGATGCTGT TTTCCACCGA TTCGAACCAC 41460
CAAGACACTC AGGTCATGTC ATGTTCAGGT TCCCGACTGA ACACAAGGGC CGGTCTTCGA 41520
GCAGTCTCAC GAATCGCCCG CTACACCTGC TTCAGAGCCG CCGACTGGAG GCAAAGGTCC 41580
GCGAGCGGCT GCAATCACTG CTTCCACCGT ACATGATTCC GTCTCGGATC ACGTTGCTCG 41640
ATCAGATGCC TCTCACGTCC AACGGCAAGG TGGATCGCAA GAAGCTTGCT CGACAAGCCC 41700


GGGTCATCCC AAGAAGTGCG GCAAGCACGT TGGACTTTGT GGCGCCACGC ACGGAAATCG 41760
AAGTCGTCCT CTGCGAAGAA TTTACCGATC TACTAGGCGT CAAGGTTGGC ATCACAGACA 41820
ACTTCTTCGA GTTGGGCGGC CATTCGCTGC TGGCCACGAA ACTGAGCGCA CGTCTAAGTC 41880


GCAGACTGGA CGCCGGTATC ACTGTGAAGC AGGTCTTTGA CCAGCCAGTA CTTGCTGATC 41940
TTGCTGCTTC TATTCTTCAA GGCTCGTCTC GTCACAGGTC TATCCCGTCT TTACCCTACG 42000
AAGGACCCGT GGAGCAGTCC TTTGCCCAGG GGCGCCTGTG GTTCCTCGAC CAGTTCAACA 42060
TCGATGCCTT GTGGTACCTT ATTCCATTTG CACTCCGCAT GCGCGGGCCG CTGCAAGTTG 42120
ACGCCCTCGC TGCTGCCCTG GTGGCACTTG AAGAGCGTCA TGAATCTCTG CGCACAACGT 42180
TTGAGGAACG AGACGGAGTC GGCATCCAAG TGGTGCAACC CCTCCGCACG ACCAAGGATA 42240


TCCGGATCAT CGACGTGTCA GGCATGCGAG ACGACGACGC CTACCTCGAG CCATTGCAGA 42300
AAGAACAGCA GACTCCTTTC GACCTTGCTT CAGAGCCTGG CTGGAGGGTA GCACTGCTGA 42360
AGCTTGGAAA GGATGACCAC ATCCTCTCTA TTGTCATGCA CCACATCATC TCTGACGGGT 42420
GGTCTACTGA AGTCTTGCAA AGGGAACTCG GTCAATTCTA CTTGGCAGCG AAATCCGGGA 42480
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10 




AAGCCCCCTT 
AGAGACAAGA 
TTGCGGACAG 
GAGAGGCAGG 
TCTGCCGGTC 
ACTATCGAAT 
GGCCTGAGCT 
TCGGCGACGA 
CCTTCGAGAA 
GGGATACGTC 
TGGGCAGGAT 
GGTTCGATCT 
TTGCCACGGA 
TTCTACAGCG 
GCATCGCTCA 
ACGCGTCCCT 
TCACTGACTC 
CTTCCTATCT 
GCTCTTGTGA 



ATCGCAGGTT 
GGAACAGGTC 
TAGCCCGGCT 
CAGCGTGTCT 
TCGCCAAGTA 
GACCGGGTCA 
TGAAAACTTG 
TGAGACGTTT 
TCAAGACGTT 
CCGAAACCCC 
CCAGCTCGAC 
CGAATTCCAC 
CCTGTTCCAG 






CTCTCGACGT 

GTAGGCTCAT 

AGACGATGCT 

TGGTTGTCCG 

GCAAGCCAAA 

ACGTCGTGAC 

ACGTTTCACT 

ACTACCTGAC 

GCGCAGCCAT 

TCGGCATGTT 

CAACCCAGGC 

TCCTTAGCAC 

GTAGCGCTGT 

CCGGTGTGAT 



TGGCCTGGAG 
GCTCCGAGAT 
CGTCGATGTC 
GACCTCCAAG 
GCGTCGGCAG 
GACCATCATC 
CAACACGCCA 
CTTGGTTGGC 
GATCAGCGAC
ACCCAGTGCT 
GGGTGTCATG 
TCACATGCCA 



GCCCCGCTTC 
GCTGAGAGTC 
GAGCTCTTGG 
TTCGTGATCA 
ACCACCTTTA 
GACGACGCAA 
ATCGGCTGCT 
GAATCACTGG 
CCGTTTGAAC 
CTAGTACAGC 
GGTGTCGTCG 
GCCTTCCAAG 
CCCGAGACCA 
CAGCCGCAGA 
GCCGGCGCGC 



CTATTCAGTA TCGCGATTTT GCTGTTTGGC 
AAAGGCAGCT CGACTACTGG AAGAAGCAGC 
CTGACTACAC CAGGCCGAAC GTACTGTCTG 
ACGATTCGGT TTACAAGAGC CTCGTCTCCT 
CGACTTTACT GGCAGCGTTT CGCGCCGCTC 



CTATTGGCAC 
TCGTCAATAC 
TACAACAGGT 
GAATCGTTTC 
TTCTCTTTGC 
ATGAGCCGGT 



AGGCCGACCG 
TCCAAGGTTT 



GTTCGAGATG 

CCTTCTTGAC 

CTTCCCGCCA 

AGAGGCTGTT 

ACTGGCCGGG 

GATATATAAC 

CAGCAATTCA 

GGGAGAGCTG 



TTCCAGCAGC 
CTGACGTATG 
CAACTCCCGG 
GCGTTCCTAG 
TCTGCTCGCA 
TCGGGCGTCC 
ACGGTTACCG 
ACAAGTCTCG 
GTGGAGCACC 
CCAGCGACAC 
TGCGCAACGC 
AGCACCATGC 
GCACTCCTGC 
TACGTTGCCG 
CCTCGTGTGT 
ATCGATAAGC 
GGGGCCTATG 
GTTGTTACAG 



GTCCCATCGC 
TGCAGATGCC 
AGGCTATGGC 
CCGAGCTGGA 
CGGAGACAAT 
CTATTCTCAA 
TGGAAGCCAT 
GCCATGCTGA 
GGACTGATGC 
CATATGTCAT 



GTGCTATCAT 
GGATGGCCCA 
TCCTCAACGG 
TCCGGGAGAC 
GACAGTGCTT 
GTGATCGCTT 
ACAACGCGTA 
ACGATCCGTA 
TCATGGATCG 
GAGAGGGTGT 



GCCAATTGCC 
CCAGTGCATG 
ACGGTCTACC 
CACCCTCAGT 
GGTTCATTCT 
TCTGTCGACC 
GCTCAATGGA 
CGTTGCGGTT 
AACCATGCCG 
AAAGTCTGAT 
CAGCCCGTCA 
TCGACTCTCC 
GGTGGCCGTT 
AGCAAATCTT 
CATATCGTCC 
TATCAACGTA 
TATTGGCACT 
CTTCACTTCA 
GCGCCTTGTG 
CGTCACGAAT 
CGGAACTCTA 
GTTTGAGCGT 
GGTCAACATG 
CCACTCCCGC 
TGGCCCAACT 
TGTGAACGGT 
GAACCAGCAG 
AGCTCGCGGC 



AATCGCAACA 
CGTATCACTA 
ACCGCGACAG 
GCCGGGTCCA 
CAACAAGGCC 
GTTTCGACTC 
AGTGTCATGT 
GTCGAAGAGG 
CTGGCCGAAG 
TACCCTCGCA 
ACTGTCGCCG 
GATCAAGCTG 
CTCGCACCGC 
GCCTACATGC 
GTCCCAGGGC 
CCGAACGCAA 
CCCGAACCTC 
GGGTCAACGG 
AAGGACAGTA 
ATCGCATTCG 
GTCTGCATTG 
GAGCAGGTTC 
CCCGATGCGA 
GACGCCCGCG 
GAGAACGCAA 
GTTCCTATCG 
CTTCTCCCTC 
TATACCGACG 



42540 

42600 

42660 

42720 

42780 

42840 

42900 

42960 

43020 

43080 

43140 

43200 

43260 

43320 

43380 

43440 

43500 

43560 

43620 

43680 

43740 

43800 

43860 

43920 

43980 

44040 

44100 

44160 

44220 

44280 

44340 

44400 

44460 
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CAAGTCTCGA TACGGACCGC TTCGTCACCG TCACGATCGA TGGCCAGCGC CAGAGGGCGT 






ACCGCACGGG TGACCGGGTG CGATATCGAC 

ATTCGCGGCC 



GCCTGGACCA 
CTCTGCTCAG 
AGGACCCGCA 
ACGAAGAGGA 
CGCTTCTGCC 
ACGCCAACGG 
CAAGCAGCTC 
ACGAGTTCGA 
GCGGGCACTC 
GCATATCCGT 
AACAACAGCA 
AACTCCTCCC 
AGAACGGTCA 
TGCACGACAA 
AGACCGCCGA 
TATTCAGAAC 
TCGATGTACC 
TGCATGAAGC
TCCTCAAGAG 
ACGGCTTGAG 
TTGCGCAAGC 
GCTACAATTT 
TCGGCGCCCT 
GGATCCCTCC 
TCTCTCTCTT 
CTGGACGACA 



GCAGGCCAAG 
CGAGAATTCA 
ACTGGTTGCC 
GGATCCGTAC 
ATACTACATG
CAAGGTGGAC 
GGGCCCCGTG 
GACTATACTC 
ACTCCTGGCC 
CAAGGATCTG 
GGGGTTCTCG 
CGCGGAAATG 
CAGCACACCC 
AGCGACGGGC 
CTGCCGTCGT 
CGTGTTCGTG 
TGTCGAGGTC 
AGACAAGCAG 
ACCGGGAGCC 
TCTTGAACAC 
ACCCAAGTTT 
CTGGCGATCT 
CGAGGCCATG 
TGCGGCACTC 
GCTCGCCAAG 
GGATCTCTCC 



CAAAGGGATT CCAGATAGAG 
ACCGTGTTGA ACTGGGCGAG 
CGGCTGTCGT ACTCCGCACC 



TTCTTCGGCC 
GTCGAACATG 
ATGGAAGAGG 



GTCACGGATG 

TTTGTGACTA CTGATCACGA ATATCGCTCG GGTTCGAGCA 



GCCACACAGG 

GTCCCGTCCC 

CGAAAAGACC 

CATGTGGCTC 

GGAGTCAAGG 

ACCAAACTCG 

TTTGACGATC 

GGAGAAGATG 

TCGAGAGAGA 

CTGGACATGT 

CACCCAGCCA 

CTGGCAAGCG 

TCAAGAGGCG 

ATCGAGACCG 



CAGCAGGCGA TATGCGCAAG 
GGGTCACAAT ACTCAGGCAA 



TCGCTCGGCG 
CTCGCAACGA 
TGGGAATCAC 
CTGCTCGGCT 
CTGTTCCTGT 
AAAGCTCGAC 



GGCCCAGATG 
GACTGAGGCA 
AGACAACTTC 
CAGCCGCCGG 
TTCTCTCGCC 
AGTTGGTATT 



TCATCCAGCG CGATGTTGTA 
ATCCAGCCAC GCAGACGCAG 
CGCCGCCACT GTTCTCCTTG 
CCTGCGCCGC TCTCGTCCAG 
GCCGCTTCTA CCAAGTTGTT 
AGCAAGAGTT GGATGAGGTT 



CGACTCCGGT 

ATGCCTCTCA 

ACTCCGACAG 

GCAATTTGCG 

TTCGAACTAG 

ATGGGCCTTC 

GGCAAGCTGG 

GTCCCCTTCC 

CCTCAGATTG 

ATCTTCTTCC 

GACTTCCCCG 

CACTTTGACA 

CTTGCTCATC 

GCTCTCGCGC 



CAGCCCCTAC GTCTGGGACG TGCGATGCTG CGGATCGCCA 
AAGATGCGAC TTGTTCTCCG AATGTCTCAT TCCCTGTACG 



ATCGTCAACG CTCTACATGC 
GGTCTCTACA TGCATCACAT 
ATTCTTCAGG GCTCTTCAAT
ACGCCGTCTG CCGGTACATG 
AAGAACGGCA TTACGCAGGC 
CATACCAAGT CGACAGACGT 



CTTGTACAGT GATAAGCACC 
GGCTAGCCGA CGTGCAGAGG 
GACATCCCTG AAGCGCTCTG 
GCAGACGTCA AAGTCCATCA 
GACCCTCTTC ACCGCCGCCG 
CGTCTTCGGC CGCGTCGTAT 



TGCCTGTGCG CGTTCGGATC 
AAGACCAGTA CACCAGCAGC 
ACTGCACGGA CTGGACTGAT 



ATAAACTGCC AAGACATCGT GGGACCTTGC ATCAACGAGG
GACGAGGGCG ACGACATGGG TGGTCTGCTG CGCGCCATTC 
TTCCGGCACG AGACCTTGGG CTTGCAAGAA GTGAAGGAGA 
GCGACCAAGG AGTTCAGTTG CTGCATTGCC TTCCAGAACC 



TCAACCTGCA TCCTGAGGCC GAGATTGAAG GGCAGCAGAT TCGCCTGGAG GGTTTGCCAG 
CAAAGGATCA AGCACGCCAG GCCAATGGTC ATGCCCCAAA TGGCACGAAC GGCACGAATG 
GCACGAATGG CACGAATGGC GCGAACGGCA CGAATGGCAC GAATGGCACG AATGGTACCC 



44520 

44580 

44640 

44700 

44760 

44820 

44880 

44940 

45000 

45060 

45120 

45180 

45240 

45300 

45360 

45420 

45480 

45540 

45600 

45660 

45720 

45780 

45840 

45900 

45960 

46020 

46080 

46140 

46200 

46260 

46320 

46380 

46440 

46500 
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ATGCCAACGG TATCAATGGT AGCAACGGTG TCAATGGCCG CGATAGCAAC GTGGTTTCAG 
CCGCTGGCGA TCAAGCTCCT GTTCACGATC TGGACATTGT TGGGATTCCG GAGCCCGACG 
GCAGCGTCAA GATTGGCATT GGTGCGAGCC GGCAGATCCT TGGAGAGAAG GTCGTGGGCA 
GCATGCTCAA TGAACTTTGC GAGACCATGC TCGCTTTGAG CAGAACATAG CAGCTTTTCC 
AGGGAGATTG GTTGGATGGA CAAGATTCTC TTCAATTATG GAGGTTGGCA TGAGGCAACA 
GGAGGACTAC TGACTTTTCA TGTTTTTTGG GGTTTTTTGG GGTTTTCTTT TTCCTTTCAT 
CTTTACTTGA TGCGCGATGT CTGCTTTCCT CTAGAATTC 
(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15281 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Tolypocladium niveum 

(B) STRAIN: ATCC 34921 



46560 
46620 
46680 
46740 
46800 
46860 
46899 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

Met Gly Ala Ile Gly Gln Asp Met Ala Tyr Asp Arg Leu Ala Asn Pro
1 5 10 15

Ser Arg Ala Ser Ser Ile Ser Ser Asn Arg Tyr Ser Glu Pro Val Glu
20 25 30

Gln Ser Phe Ala Gln Gly Arg Leu Trp Phe Leu His Gln Leu Lys Leu
35 40 45

Gly Ala Ser Trp Asp Ile Thr Pro Ala Ala Ile Arg Leu Arg Gly His
50 55 60

Leu Asp Ile Asp Ala Leu Asn Ala Ala Ser Arg Ala Leu Thr Gln Arg
65 70 75 80

His Glu Thr Leu Arg Thr Thr Phe Lys Glu Gln Asp Gly Val Gly Val
85 90 95

Gln Val Val His Ala Ser Gly Leu Glu Arg Gly Leu Arg Ile Val Asp
100 105 110

Ala Ser Ser Arg Asp Leu Ala Gln Leu Leu Ala Glu Glu Gln Thr Met
115 120 125

Lys Phe Asp Leu Glu Ser Glu Pro Ala Trp Arg Val Ala Leu Leu Lys
130 135 140

Val Ala Glu Asp His His Ile Leu Ser Ile Val Val His His Ile Ile
145 150 155 160

Ser Asp Ser Arg Ser Leu Asp Ile Ile Gln Gln Glu Leu Gly Glu Leu
165 170 175
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20 






Tyr Thr Ala Ala Ser Gln Gly Lys Ser Ile Ser Ala Cys Pro Leu Gly
180 185 190

Pro Ile Pro Ile Gln Tyr Arg Asp Leu Thr Thr Trp Gln Asn Gln Asp
195 200 205

Glu Gln Val Ala Glu Gln Glu Arg Gln Leu Gly Tyr Trp Ile Glu Gln
210 215 220

Leu Asp Asn Asn Thr Pro Ala Glu Leu Leu Thr Glu Leu Pro Arg Pro
225 230 235 240

Ala Ile Pro Ser Gly Glu Thr Gly Lys Ile Ser Phe Gln Ile Asp Gly
245 250 255

Ser Val His Lys Glu Leu Leu Ala Phe Cys Arg Ser Gln Gln Val Thr
260 265 270

Ala Tyr Ala Val Leu Leu Ala Ala Phe Arg Val Ala His Phe Arg Leu
275 280 285

Thr Gly Ala Glu Asp Ala Thr Ile Gly Ala Pro Val Ala Asn Arg Asp
290 295 300

Arg Pro Glu Leu Glu Asn Met Val Ala Pro Leu Ala Thr Leu Gln Cys
305 310 315 320

Met Arg Val Val Leu Asp Glu Asp Asp Thr Phe Glu Ser Val Leu Arg
325 330 335

Gln Ile Met Ser Val Met Thr Glu Ala His Ala Asn Arg Asp Val Pro
340 345 350

Phe Glu Arg Ile Val Ser Ala Leu Leu Pro Gly Ser Thr Asp Thr Ser
355 360 365

Arg His Pro Leu Val Gln Leu Met Phe Ala Leu His Pro Ala Gln Asp
370 375 380

Thr Gly Arg Ala Arg Trp Gly Phe Leu Glu Ala Glu Thr Leu Gln Ser
385 390 395 400

Ala Ala Pro Thr Arg Phe Asp Met Glu Met His Leu Phe Glu Gly Asp
405 410 415

Asp Arg Phe Asp Ala Asn Val Leu Phe Ser Thr Gly Leu Phe Asp Ala
420 425 430

Glu Ala Ile Arg Ser Val Val Ser Ile Phe Arg Glu Val Leu Arg Arg
435 440 445

Gly Ile Ser Glu Pro Ala Val His Val Lys Thr Met Pro Leu Thr Asp
450 455 460

Gly Leu Ala Ala Ile Arg Asp Met Gly Leu Leu Asp Ile Gly Thr Thr
465 470 475 480

Asp Tyr Pro Arg Glu Ala Ser Val Val Asp Met Phe Gln Glu Gln Val
485 490 495

Ala Leu Asn Pro Ser Ala Thr Ala Val Ala Asp Ala Ser Ser Arg Leu
500 505 510

Ser Tyr Ser Glu Leu Asp His Lys Ser Asp Gln Leu Ala Ala Trp Leu
515 520 525
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Arg Arg Arg Gin Leu Lys Pro Glu Thr Leu He Gly Val Leu Ser Pro 
530 535 540 

Pro Ser Cys Glu Thr Met Val Ser Phe Leu Gly Ile Leu Lys Ala His
545 550 555 560

Leu Ala Tyr Leu Pro Leu Asp Ile Asn Val Pro Leu Ala Arg Ile Glu
565 570 575

Ser Ile Leu Ser Ala Val Asp Gly His Lys Leu Val Leu Leu Gly Ser
580 585 590

Asn Val Pro Gln Pro Lys Val Asp Val Pro Asp Val Glu Leu Leu Arg
595 600 605

Ile Ser Asp Ala Leu Asn Gly Ser Gln Val Asn Gly Leu Ala Gly Lys
610 615 620

Gln Ala Thr Ala Lys Pro Ser Ala Thr Asp Leu Ala Tyr Val Ile Phe
625 630 635 640

Thr Ser Gly Ser Thr Gly Lys Pro Lys Gly Val Met Ile Glu His Arg
645 650 655

Gly Ile Val Arg Leu Val Lys Gly Thr Asn Ile Ile Ser Pro Ala Gln
660 665 670

Ala Ala Val Pro Thr Ala His Leu Ala Asn Ile Ala Phe Asp Leu Ser
675 680 685

Thr Trp Glu Ile Tyr Thr Pro Ile Leu Asn Gly Gly Thr Leu Val Cys
690 695 700

Ile Glu His Ser Val Thr Leu Asp Ser Lys Ala Leu Glu Ala Val Phe
705 710 715 720

Thr Lys Glu Gly Ile Arg Val Ala Phe Leu Ala Pro Ala Leu Ile Lys
725 730 735

Gln Cys Leu Ala Asp Arg Pro Ala Ile Phe Ala Gly Leu Asp Ser Leu
740 745 750

Tyr Ala Ile Gly Asp Arg Phe Asp Arg Arg Asp Ala Leu His Ala Lys
755 760 765

Ser Leu Val Lys His Gly Val Tyr Asn Ala Tyr Gly Pro Thr Glu Asn
770 775 780

Ser Val Val Ser Thr Ile Tyr Ser Val Ser Glu Ala Ser Pro Phe Val
785 790 795 800

Thr Gly Val Pro Val Gly Arg Ala Ile Ser Asn Ser Gly Ala Tyr Val
805 810 815

Met Asp Gln Asp Gln Gln Leu Val Ser Pro Gly Val Met Gly Glu Leu
820 825 830

Val Val Ser Gly Asp Gly Leu Ala Arg Gly Tyr Thr Asp Ser Ala Leu
835 840 845

Asp Lys Asn Arg Phe Val Val Val Gln Ile Asp Gly Glu Ser Ile Arg
850 855 860

Gly Tyr Arg Thr Gly Asp Arg Ala Arg Tyr Ser Leu Lys Gly Gly Gln
865 870 875 880
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He Glu Phe Phe Gly Arg Met Asp Gin Gin Val Lys He Arg Gly His 

885 890 895 

Arg Ile Glu Pro Ala Glu Val Glu His Ala Leu Leu Asn Ser Asp Gln
900 905 910

Val Arg Asp Ala Ala Val Val Ile Arg Arg Gln Glu Glu Glu Glu Pro
915 920 925

Ala Met Ile Ala Phe Val Thr Thr Gln Gly Thr Leu Pro Asp His Leu
930 935 940

Val Asn Ile Asn Gly Asn Gly His Val Pro Asp Gly Asn Gly Ser Lys
945 950 955 960

Asn Asp Gln Phe Ala Val His Val Glu Ser Glu Leu Arg Arg Arg Leu
965 970 975

Gln Met Leu Leu Pro Ser Tyr Met Met Pro Ala Arg Ile Val Val Leu
980 985 990

Asp His Leu Pro Leu Asn Pro Asn Gly Lys Val Asp Arg Lys Ala Leu
995 1000 1005

Gly Gln Ser Ala Lys Thr Val Gln Lys Ser Lys Leu Val Ser Gln Arg
1010 1015 1020

Val Ala Pro Arg Asn Glu Ile Glu Ala Val Leu Cys Glu Glu Tyr Arg
1025 1030 1035 1040

Ser Val Leu Gly Val Glu Val Gly Ile Thr Asp Asn Phe Phe Asp Leu
1045 1050 1055

Gly Gly His Ser Leu Thr Ala Met Lys Leu Ala Ala Arg Ile Ser Gln
1060 1065 1070

Arg Leu Asp Ile Gln Ala Ser Val Ala Thr Val Phe Glu Gln Pro Met
1075 1080 1085

Leu Ala Asp Leu Ala Ala Thr Ile Gln Arg Gly Ser Thr Leu Tyr Ser
1090 1095 1100

Val Ile Pro Thr Thr Glu Tyr Thr Gly Pro Val Glu Gln Ser Phe Ala
1105 1110 1115 1120

Gln Gly Arg Leu Trp Phe Leu Glu Gln Leu Asn Thr Gly Ala Ser Trp
1125 1130 1135

Tyr Asn Val Met Leu Thr Val Arg Leu Arg Gly His Leu Asp Val Asp
1140 1145 1150

Ala Leu Gly Thr Ala Leu Leu Ala Leu Glu Lys Arg His Glu Thr Leu
1155 1160 1165

Arg Thr Thr Phe Glu Glu Arg Asp Gly Val Gly Met Gln Val Val His
1170 1175 1180

Ser Ser Leu Met Gly Glu Leu Arg Leu Ile Asp Ile Ser Glu Lys Ser
1185 1190 1195 1200

Gly Thr Ala Ala His Glu Ala Leu Met Lys Glu Gln Ser Thr Arg Phe
1205 1210 1215

Asp Leu Thr Arg Glu Pro Gly Trp Arg Val Ala Leu Leu Lys Leu Ala
1220 1225 1230

Asp His His Ile Phe Ser Ile Val Met His His Ile Val Ser Asp Gly
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1235 1240 1245 

Trp Ser Leu Asp Leu Leu Arg His Glu Leu Gly Gln Leu Tyr Ser Ala
1250 1255 1260

Ala Leu Arg Gly Gln Asp Pro Leu Ser Arg Leu Glu Pro Leu Pro Ile
1265 1270 1275 1280

Gln Tyr Arg Asp Phe Ala Val Trp Gln Lys Gln Asp Ser Gln Gln Lys
1285 1290 1295

Ala Ala His Gln Arg Gln Leu Glu Tyr Trp Thr Lys Gln Leu Ala Asp
1300 1305 1310

Ser Thr Pro Ala Glu Leu Leu Thr Asp Phe Pro Arg Pro Ser Ile Leu
1315 1320 1325

Ser Gly Lys Ala Gly Lys Val Pro Val Ala Ile Glu Gly Ser Leu Tyr
1330 1335 1340

Asp Thr Leu Gln Val Phe Ser Arg Thr His Gln Val Thr Ser Phe Ala
1345 1350 1355 1360

Val Leu Leu Ala Ala Phe Arg Ala Ala His Phe Arg Leu Thr Gly Ser
1365 1370 1375

Asp Asn Ala Thr Ile Gly Val Pro Ser Ala Asn Arg Asn Arg Pro Glu
1380 1385 1390

Leu Glu Asn Val Ile Gly Phe Phe Val Asn Thr Gln Cys Ile Arg Ile
1395 1400 1405

Thr Ile Asp Glu Asn Asp Asn Phe Glu Ser Leu Val Arg Gln Val Arg
1410 1415 1420

Ser Thr Thr Thr Ala Ala Gln Asp Asn Gln Asp Val Pro Phe Glu Gln
1425 1430 1435 1440

Val Val Ser Ser Leu Met Pro Ser Ser Ser Arg Asp Ala Ser Arg Asn
1445 1450 1455

Pro Leu Val Gln Leu Met Phe Ala Leu His Gly Gln Gln Asp Leu Phe
1460 1465 1470

Lys Ile Gln Leu Glu Gly Thr Glu Glu Glu Val Ile Pro Thr Glu Glu
1475 1480 1485

Val Thr Arg Phe Asp Ile Glu Phe His Leu Tyr Gln Gly Ala Ser Lys
1490 1495 1500

Leu Ser Gly Asp Ile Ile Phe Ala Ala Asp Leu Phe Glu Ala Glu Thr
1505 1510 1515 1520

Ile Arg Gly Val Val Ser Val Phe Gln Glu Val Leu Arg Arg Gly Leu
1525 1530 1535

Gln Gln Pro Gln Thr Pro Ile Met Thr Met Pro Leu Thr Asp Gly Ile
1540 1545 1550

Pro Glu Leu Glu Arg Met Gly Leu Leu His Met Val Lys Thr Asp Tyr
1555 1560 1565

Pro Arg Asn Met Ser Val Val Asp Val Phe Gln Gln Gln Val Arg Leu
1570 1575 1580

Ser Ala Glu Ala Thr Ala Val Ile Asp Ser Ser Ser Arg Met Ser Tyr
1585 1590 1595 1600
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Ala Glu Leu Asp Gin Arg Ser A3p Gin Val Ala Ala Trp Leu Arg Gin 

1605 1610 1615 

Arg Gln Leu Pro Ala Glu Thr Phe Val Ala Val Leu Ala Pro Arg Ser
1620 1625 1630

Cys Glu Ala Val Ile Ala Leu Phe Gly Ile Leu Lys Ala Gly His Ala
1635 1640 1645

Tyr Leu Pro Leu Asp Val Asn Val Pro Ala Ala Arg Leu Arg Ala Ile
1650 1655 1660

Leu Ala Glu Val Lys Gly Glu Lys Leu Val Leu Leu Gly Ala Gly Glu
1665 1670 1675 1680

Pro Ser Pro Glu Gly Gln Ser Pro Glu Val Ser Ile Val Arg Ile Ala
1685 1690 1695

Asp Ala Thr Ser Pro Ala Gly His Ala Ser Leu Arg Asp Gly Lys Ser
1700 1705 1710

Lys Pro Thr Ala Gly Ser Leu Ala Tyr Val Ile Phe Thr Ser Gly Ser
1715 1720 1725

Thr Gly Lys Pro Lys Gly Val Met Ile Glu His Arg Gly Val Leu Arg
1730 1735 1740

Leu Val Lys Gln Thr Asn Ile Leu Ser Ser Leu Pro Pro Ala Gln Thr
1745 1750 1755 1760

Phe Arg Met Ala His Met Ser Asn Leu Ala Phe Asp Ala Ser Ile Trp
1765 1770 1775

Glu Val Phe Thr Ala Leu Leu Asn Gly Gly Ser Leu Val Cys Ile Asp
1780 1785 1790

Arg Phe Thr Ile Leu Asp Ala Gln Ala Leu Glu Ala Leu Phe Leu Arg
1795 1800 1805

Glu His Ile Asn Ile Ala Leu Phe Pro Pro Ala Leu Leu Lys Gln Cys
1810 1815 1820

Leu Thr Asp Ala Ala Ala Thr Ile Lys Ser Leu Asp Leu Leu Tyr Val
1825 1830 1835 1840

Gly Gly Asp Arg Leu Asp Thr Ala Asp Ala Ala Leu Ala Lys Ala Leu
1845 1850 1855

Val Lys Ser Glu Val Tyr Asn Ala Tyr Gly Pro Thr Glu Asn Thr Val
1860 1865 1870

Met Ser Thr Leu Tyr Ser Ile Ala Asp Thr Glu Arg Phe Val Asn Gly
1875 1880 1885

Val Pro Ile Gly Arg Ala Val Ser Asn Ser Gly Val Tyr Val Met Asp
1890 1895 1900

Gln Asn Gln Gln Leu Val Pro Leu Gly Val Met Gly Glu Leu Val Val
1905 1910 1915 1920

Thr Gly Asp Gly Leu Ala Arg Gly Tyr Thr Asn Pro Ala Leu Asp Ser
1925 1930 1935

Asp Arg Phe Val Asp Val Ile Ala Arg Gly Gln Leu Leu Arg Ala Tyr
1940 1945 1950
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Arg Thr Gly Asp Arg Ala Arg Tyr Arg Pro Lys Asp Gly Gin Val Glu 
1955 1960 1965 

Phe Phe Gly Arg Met Asp His Gln Val Lys Val Arg Gly His Arg Ile
1970 1975 1980

Glu Leu Ala Glu Val Glu His Ala Leu Leu Ser Ser Ala Gly Val His
1985 1990 1995 2000

Asp Ala Val Val Val Ser Asn Ser Gln Glu Asp Asn Gln Gly Val Glu
2005 2010 2015

Met Val Ala Phe Ile Thr Ala Gln Asp Asn Glu Thr Leu Gln Glu Ala
2020 2025 2030

Gln Ser Ser Asn Gln Val Gln Glu Trp Glu Ser His Phe Glu Thr Thr
2035 2040 2045

Ala Tyr Ala Asp Ile Thr Ala Ile Asp Gln Asn Thr Leu Gly Arg Asp
2050 2055 2060

Phe Thr Ser Trp Thr Ser Met Tyr Asp Gly Thr Leu Ile Asp Lys Arg
2065 2070 2075 2080

Glu Met Gln Glu Trp Leu Asp Asp Thr Met Arg Thr Phe Leu Asp Gly
2085 2090 2095

Gln Ala Ala Gly His Val Leu Glu Ile Gly Thr Gly Thr Gly Met Val
2100 2105 2110

Leu Phe Asn Leu Gly Gln Ala Gly Leu Lys Ser Tyr Ile Gly Leu Glu
2115 2120 2125

Pro Ser Gln Ser Ala Val Gln Phe Val Asn Lys Ala Ala Gln Thr Phe
2130 2135 2140

Pro Gly Leu Glu Gly Lys Ala Gln Val His Val Gly Thr Ala Met Asp
2145 2150 2155 2160

Thr Gly Arg Leu Ser Ala Leu Ser Pro Asp Leu Ile Val Ile Asn Ser
2165 2170 2175

Val Ala Gln Tyr Phe Pro Ser Arg Glu Tyr Leu Ala Glu Val Val Glu
2180 2185 2190

Ala Leu Val Arg Ile Pro Gly Val Arg Arg Ile Phe Phe Gly Asp Met
2195 2200 2205

Arg Thr Tyr Ala Thr His Lys Asp Phe Leu Val Ala Arg Ala Val His
2210 2215 2220

Thr Asn Gly Ser Lys Val Thr Arg Ser Lys Val Gln Gln Glu Val Ala
2225 2230 2235 2240

Arg Leu Glu Glu Leu Glu Glu Glu Leu Leu Val Asp Pro Ala Phe Phe
2245 2250 2255

Thr Ser Leu Lys Glu Ser Leu Ser Glu Glu Ile Glu His Val Glu Ile
2260 2265 2270

Leu Pro Lys Asn Met Lys Val Asn Asn Glu Leu Ser Ser Tyr Arg Tyr
2275 2280 2285

Gly Ala Val Leu His Ile Arg Asn His Asn Gln Asn Gln Ser Arg Ser
2290 2295 2300

Ile His Lys Ile Asn Ala Glu Ser Trp Ile Asp Phe Ala Ser Ser Gln
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2305 2310 2315 2320 

Met Asp Arg Gln Gly Leu Ala Arg Leu Leu Lys Glu Asn Lys Asp Ala
2325 2330 2335

Glu Ser Ile Ala Val Phe Asn Ile Pro Tyr Ser Lys Thr Ile Val Glu
2340 2345 2350

Arg His Ile Ala Lys Ser Leu Ala Asp Asp His Asp Gly Asp Asp Thr
2355 2360 2365

His f e * Ser Ile Gly Val Ala Trp Ile Ser Ala Ala Arg Glu Lys

2370 2375 2380 

Ala Ser Gln Cys Pro Ser Leu Asp Val His Asp Leu Val Gln Leu Ala
2385 2390 2395 2400

Glu Asp Ala Gly Phe Arg Val Glu Val Ser Trp Ala Arg Gln Arg Ser
2405 2410 2415

Gln Asn Gly Ala Leu Asp Val Phe Phe His His Phe Gln Pro Thr Glu
2420 2425 2430

Asn Glu Ser Arg Ala Leu Val Asp Phe Pro Thr Asp Tyr Lys Gly Gln
2435 2440 2445

Gln ^cn Arg Ser Leu Thr Asn Arg Pro Leu Sin Arg Val Glu Ser Arg
2450 2455 2460

o^f= Ile Glu Ala Gln Val *** Glu Gln Leu Gln Val Leu Leu Pro Ala

2465 2470 2475 2480 

Tyr Met Ile Pro Ala Arg Ile Val Val Leu Gln Asn Met Pro Leu Asn
2485 2490 2495

Thr Ser Gly Lys Val Asp Arg Lys Glu Leu Thr Leu Arg Ala Lys Val
2500 2505 2510

Thr Ala Ala Arg Thr Pro Ser Ser Glu Leu Val Ala Pro Arg Asp Ser
2515 2520 2525

Ile G co n Ala Ile Ile Cys L* 3 Glu Phe Asp Val Leu Gly Val Glu
2530 2535 2540

5ti<; Gly Ile Thr Asp Asn Phe Phe Asn Val Gly Gly His Ser Leu Leu
2545 2550 2555 2560

Ala Thr Lys Leu Ala Ala Arg Leu Ser Arg Gln Leu Asn Ala Gln Ile
2565 2570 2575

Ala Val Lys Asp Ile Phe Asp Arg Pro Val Ile Ala Asp Leu Ala Ala
2580 2585 2590

Thr Ile Gln Gln Asp Thr Thr Glu His Asn Pro Ile Leu Pro Thr Ser
2595 2600 2605

?Mn Gly Pro Val Glu S? Ser Phe Ala Gln Gly Arg Leu Trp Phe
2610 2615 2620

Leu Asp Gln Leu Asn Val Gly Ala Thr Trp Tyr Leu Met Pro Phe Ala 
2625 2630 2635 2640

Val Arg Leu Arg Gly Pro Leu Val Val Ser Ala Leu Ala Ala Ala Leu 

2645 2650 2655 

Leu Ala Leu Glu Glu Arg His Glu Thr Leu Arg Thr Thr Phe Ile Glu 
2660 2665 2670






Gln Glu Gly Ile Gly Met Gln Val Ile His Pro Phe Ala Pro Lys Glu
2675 2680 2685

Leu Arg Val Ile Asp Val Ser Gly Glu Glu Glu Ser Thr Ile Gln Lys
2690 2695 2700

Ile Leu Glu Lys Glu Gln Thr Thr Pro Phe Asn Leu Ala Ser Glu Pro
2705 2710 2715 2720

Gly Phe Arg Leu Ala Leu Leu Lys Thr Gly Glu Asp Glu His Ile Leu
2725 2730 2735

Ser Thr Val Met His His Ala Ile Ser Asp Gly Trp Ser Val Asp Ile
2740 2745 2750

Phe Gln Gln Glu Ile Gly Gln Phe Tyr Ser Ala Ile Leu Arg Gly His
2755 2760 2765

Asp Pro Leu Ala Gln Ile Ala Pro Leu Ser Ile Gln Tyr Arg Asp Phe
2770 2775 2780

Ala Thr Trp Gln Arg Gln Ile Phe Gln Val Ala Glu His Arg Arg Gln
2785 2790 2795 2800

Leu Ala Tyr Trp Thr Lys Gln Leu Ala Asp Asn Lys Pro Ala Glu Leu
2805 2810 2815

Leu Thr Asp Phe Lys Arg Pro Pro Met Leu Ser Gly Arg Ala Gly Glu
2820 2825 2830

Ile Pro Val Val Val Asp Gly Leu Ile Tyr Glu Lys Leu Gln Asp Phe
2835 2840 2845

Cys Arg Ile Arg Gln Val Thr Ala Phe Thr Val Leu Leu Ala Ala Phe
2850 2855 2860

Arg Ala Ala His Tyr Arg Met Thr Gly Thr Glu Asp Ala Thr Ile Gly
2865 2870 2875 2880

Thr Pro Ile Ala Asn Arg Asn Arg Pro Glu Leu Glu Gly Leu Ile Gly
2885 2890 2895

Phe Phe Val Asn Thr Gln Cys Met Arg Ile Thr Val Asp Val Glu Asp
2900 2905 2910

Ser Phe Glu Thr Leu Val His Gln Val Arg Glu Thr Thr Leu Ala Ala
2915 2920 2925

His Ala Asn Gln Asp Val Pro Phe Glu Gln Ile Val Ser Asn Ile Leu
2930 2935 2940

Pro Gly Ser Ser Asp Thr Ser Arg Asn Pro Leu Val Gln Leu Met Phe
2945 2950 2955 2960

Ala Leu His Ser Gln Gln Asn Leu Gly Lys Val Arg Leu Glu Gly Ile
2965 2970 2975

Glu Glu Glu Ile Ile Ser Ile Ala Glu Thr Thr Arg Phe Asp Ile Glu
2980 2985 2990

Phe His Leu Tyr Gln Glu Ala Glu Arg Leu Asn Gly Ser Ile Val Tyr
2995 3000 3005

Ala Ala Asp Leu Phe Val Pro Glu Thr Ile Gln Ser Val Ile Thr Ile
3010 3015 3020
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Phe Gin Gly lie Leu Gin Lys Gly Leu Gly Glu Pro Asp Met Pro Val 
3025 3030 3035 3040 

Ala Ser Met Ala Leu Asp Gly Gly Leu Glu Ser Leu Arg Ser Thr Gly 

3045 3050 3055 

Leu Leu His Pro Gln Gln Thr Asp Tyr Pro Cys Asp Ala Ser Val Val
3060 3065 3070

Gln Ile Phe Lys Gln Gln Val Ala Val Asn Pro Asp Val Ile Ala Val
3075 3080 3085

Arg Asp Glu Ser Thr Arg Leu Ser Tyr Ala Asp Leu Asp Arg Lys Ser
3090 3095 3100

Asp Gln Val Ala Cys Trp Leu Ser Arg Arg Gly Ile Ala Pro Glu Thr
3105 3110 3115 3120

Phe Val Ala Ile Leu Ala Pro Arg Ser Cys Glu Thr Ile Val Ala Ile
3125 3130 3135

Leu Gly Val Leu Lys Ala Asn Leu Ala Tyr Leu Pro Leu Asp Val Asn
3140 3145 3150

Val Pro Ala Ser Arg Leu Glu Ala Ile Leu Ser Glu Val Ser Gly Ser
3155 3160 3165

Met Leu Val Leu Val Gly Ala Glu Thr Pro Ile Pro Glu Gly Met Ala
3170 3175 3180

Glu Ala Glu Thr Ile Arg Ile Thr Glu Ile Leu Ala Asp Ala Lys Thr
3185 3190 3195 3200

Asp Asp Ile Asn Gly Leu Ala Ala Ser Gln Pro Thr Ala Ala Ser Leu
3205 3210 3215

Ala Tyr Val Ile Phe Thr Ser Gly Ser Thr Gly Arg Pro Lys Gly Val
3220 3225 3230

Met Val Glu His Arg Gly Ile Val Arg Leu Thr Lys Gln Thr Asn Ile
3235 3240 3245

Thr Ser Lys Leu Pro Glu Ser Phe His Met Ala His Ile Ser Asn Leu
3250 3255 3260

Ala Phe Asp Ala Ser Val Trp Glu Val Phe Thr Thr Leu Leu Asn Gly
3265 3270 3275 3280

Gly Thr Leu Val Cys Ile Asp Tyr Phe Thr Leu Leu Glu Ser Thr Ala
3285 3290 3295

Leu Glu Lys Val Phe Phe Asp Gln Arg Val Asn Val Ala Leu Leu Pro
3300 3305 3310

Pro Ala Leu Leu Lys Gln Cys Leu Asp Asn Ser Pro Ala Leu Val Lys
3315 3320 3325

Thr Leu Ser Val Leu Tyr Ile Gly Gly Asp Arg Leu Asp Ala Ser Asp
3330 3335 3340

Ala Ala Lys Ala Arg Gly Leu Val Gln Thr Gln Ala Phe Asn Ala Tyr
3345 3350 3355 3360

Gly Pro Thr Glu Asn Thr Val Met Ser Thr Ile Tyr Pro Ile Ala Glu
3365 3370 3375

Asp Pro Phe Ile Asn Gly Val Pro Ile Gly His Ala Val Ser Asn Ser
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10 






3380 3385 3390 

Gly Ala Phe Val Met Asp Gln Asn Gln Gln Ile Thr Pro Pro Gly Ala
3395 3400 3405

Met Gly Glu Leu Ile Val Thr Gly Asp Gly Leu Ala Arg Gly Tyr Thr
3410 3415 3420

Thr Ser Ser Leu Asn Thr Gly Arg Phe Ile Asn Val Asp Ile Asp Gly
3425 3430 3435 3440

Glu Gln Val Arg Ala Tyr Arg Thr Gly Asp Arg Val Arg Tyr Arg Pro
3445 3450 3455

Lys Asp Leu Gln Ile Glu Phe Phe Gly Arg Ile Asp His Gln Val Lys
3460 3465 3470

Ile Arg Gly His Arg Ile Glu Pro Ala Glu Val Glu Tyr Ala Leu Leu
3475 3480 3485

Ser His Asp Leu Val Thr Asp Ala Ala Val Val Thr His Ser Gln Glu
3490 3495 3500

Asn Gln Asp Leu Glu Met Val Gly Phe Val Ala Ala Arg Val Ala Asp
3505 3510 3515 3520

Val Arg Glu Asp Glu Ser Ser Asn Gln Val Gln Glu Trp Gln Thr His
3525 3530 3535

Phe Asp Ser Ile Ala Tyr Ala Asp Ile Thr Thr Ile Asp Gln Gln Ser
3540 3545 3550

Leu Gly Arg Asp Phe Met Ser Trp Thr Ser Met Tyr Asp Gly Ser Leu
3555 3560 3565

Ile Lys Lys Ser Gln Met Gln Glu Trp Leu Asp Asp Thr Met Arg Ser
3570 3575 3580

Leu Leu Asp Ser Gln Pro Pro Gly His Val Leu Glu Val Gly Thr Gly
3585 3590 3595 3600

Thr Gly Met Val Leu Phe Asn Leu Gly Arg Glu Gly Gly Leu Gln Ser
3605 3610 3615

Tyr Val Gly Leu Glu Pro Ser Pro Ser Ala Thr Ala Phe Val Asn Lys
3620 3625 3630

Ala Ala Lys Ser Phe Pro Gly Leu Glu Asp Arg Ile Arg Val Glu Val
3635 3640 3645

Gly Thr Ala Thr Asp Ile Asp Arg Leu Gly Asp Asp Leu His Ala Gly
3650 3655 3660

Leu Val Val Val Asn Ser Val Ala Gln Tyr Phe Pro Ser Gln Asp Tyr
3665 3670 3675 3680

Leu Ala Gln Leu Val Arg Asp Leu Thr Lys Val Pro Gly Val Glu Arg
3685 3690 3695

Ile Phe Phe Gly Asp Met Arg Ser His Ala Ile Asn Arg Asp Phe Leu
3700 3705 3710

Val Ala Arg Ala Val His Ala Leu Gly Asp Lys Ala Thr Lys Ala Glu
3715 3720 3725

Ile Gln Arg Glu Val Val Arg Met Glu Glu Ser Glu Asp Glu Leu Leu
3730 3735 3740
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Val Asp Pro Ala Phe Phe Thr Ser Leu Thr Thr Gin Val Glu Asn lie 
3745 3750 3755 3760 

Lys His Val Glu Ile Leu Pro Lys Arg Met Arg Ala Thr Asn Glu Leu
3765 3770 3775

Ser Ser Tyr Arg Tyr Ala Ala Val Leu His Val Asn Asp Leu Ala Lys
3780 3785 3790

Pro Ala His Lys Val Ser Pro Gly Ala Trp Val Asp Phe Ala Ala Thr
3795 3800 3805

Lys Met Asp Arg Asp Ala Leu Ile Arg Leu Leu Arg Gly Thr Lys Ile
3810 3815 3820

Ser Asp His Ile Ala Ile Ala Asn Ile Pro Asn Ser Lys Thr Ile Val
3825 3830 3835 3840

Glu Arg Thr Ile Cys Glu Ser Val Tyr Asp Leu Gly Gly Asp Ala Lys
3845 3850 3855

Asp Ser Asn Asp Arg Val Ser Trp Leu Ser Ala Ala Arg Ser Asn Ala
3860 3865 3870

Val Lys Val Ala Ser Leu Ser Ala Ile Asp Leu Val Asp Ile Ala Gln
3875 3880 3885

Glu Ala Gly Phe Arg Val Glu Ile Ser Cys Ala Arg Gln Trp Ser Gln
3890 3895 3900

Asn Gly Ala Leu Asp Ala Val Phe His His Leu Gly Pro Ser Pro Gln
3905 3910 3915 3920

Ser Ser His Val Leu Ile Asp Phe Leu Thr Asp His Gln Gly Arg Pro
3925 3930 3935

Glu Glu Ala Leu Thr Asn His Pro Leu His Arg Ala Gln Ser Arg Arg
3940 3945 3950

Val Glu Arg Gln Ile Arg Glu Arg Leu Gln Thr Leu Leu Pro Ala Tyr
3955 3960 3965

Met Ile Pro Ala Gln Ile Met Val Leu Asp Lys Leu Pro Leu Asn Ala
3970 3975 3980

Asn Gly Lys Val Asp Arg Lys Gln Leu Thr Gln Arg Ala Gln Thr Val
3985 3990 3995 4000

Pro Lys Ala Lys Gln Val Ser Ala Pro Val Ala Pro Arg Thr Glu Ile
4005 4010 4015

Glu Arg Val Leu Cys Gln Glu Phe Ser Asp Val Leu Gly Val Asp Ile
4020 4025 4030

Gly Ile Met Glu Asn Phe Phe Asp Leu Gly Gly His Ser Leu Met Ala
4035 4040 4045

Thr Lys Leu Ala Ala Arg Ile Ser Arg Arg Leu Glu Thr His Val Ser
4050 4055 4060

Val Lys Glu Ile Phe Asp His Pro Arg Val Cys Asp Leu Val Leu Ile
4065 4070 4075 4080

Val Gln Gln Gly Ser Ala Pro His Asp Pro Ile Val Ser Thr Lys Tyr
4085 4090 4095
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10 






Thr Gly Pro Val Pro Gin Ser Phe Ala Gin Gly Arg Leu Trp Phe Leu 

4100 4105 4110 

Asp Gln Leu Asn Phe Gly Ala Thr Trp Tyr Leu Met Pro Leu Ala Val 
4115 4120 4125 

Arg Leu Arg Gly Ala Met Asn Val His Ala Leu Thr Ala Ala Leu Leu 
4130 4135 4140 

Ala Leu Glu Arg Arg His Glu Leu Leu Arg Thr Thr Phe Tyr Glu Gin 
4145 4150 4155 4160 

Asn Gly Val Gly Met Gin Lys Val Asn Pro Val Val Thr Glu Thr Leu 

4165 4170 4175 

Arg Ile Ile Asp Leu Ser Asn Gly Asp Gly Asp Tyr Leu Pro Thr Leu 

4180 4185 4190 

Lys Lys Glu Gin Thr Ala Pro Phe His Leu Glu Thr Glu Pro Gly Trp 
4195 4200 4205 

Arg Val Ala Leu Leu Arg Leu Gly Pro Gly Asp Tyr He Leu Ser Val 
4210 4215 4220 

Val Met His His Ile Ile Ser Asp Gly Trp Ser Val Asp Val Leu Phe 
4225 4230 4235 4240 

Gin Glu Leu Gly Gin Phe Tyr Ser Thr Ala Val Lys Gly His Asp Pro 

4245 4250 4255 

Leu Ser Gln Thr Thr Pro Leu Pro Ile His Tyr Arg Asp Phe Ala Leu 

4260 4265 4270 

Trp Gin Lys Lys Pro Thr Gin Glu Ser Glu His Glu Arg Gin Leu Gin 
4275 4280 4285 

Tyr Trp Val Glu Gin Leu Val Asp Ser Ala Pro Ala Glu Leu Leu Thr 
4290 4295 4300 

Asp Leu Pro Arg Pro Ser He Leu Ser Gly Gin Ala Gly Glu Met Ser 
4305 4310 4315 4320 

Val Thr He Glu Gly Ala Leu Tyr Lys Asn Leu Glu Glu Phe Cys Arg 

4325 4330 4335 

Val His Arg Val Thr Ser Phe Val Val Leu Leu Ala Ala Leu Arg Ala 

4340 4345 4350 

Ala His Tyr Arg Leu Thr Gly Ser Glu Asp Ala Thr He Gly Thr Pro 
4355 4360 4365 

He Ala Asn Arg Asn Arg Pro Glu Leu Glu Gin He He Gly Phe Phe 
4370 4375 4380 

Val Asn Thr Gin Cys He Arg He Thr Val Asn Glu Asp Glu Thr Phe 
4385 4390 4395 4400 

Glu Ser Leu Val Gin Gin Val Arg Ser Thr Ala Thr Ala Ala Phe Ala 

4405 4410 4415 

His Gln Asp Val Pro Phe Glu Lys Ile Val Ser Thr Leu Leu Pro Gly 

4420 4425 4430 

Ser Arg Asp Ala Ser Arg Asn Pro Leu Val Gin Leu Met Phe Ala Val 
4435 4440 4445 

His Ser Gln Lys Asn Leu Gly Glu Leu Lys Leu Glu Asn Ala His Ser 
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4450 4455 4460 

Glu Val Val Pro Thr Glu lie Thr Thr Arg Phe Asp Leu Glu Phe His 
4465 4470 4475 4480 

Leu Phe Gin Gin Asp Asp Lys Leu Glu Gly Ser lie Leu Tyr Ser Thr 

4485 4490 4495 

Asp Leu Phe Glu Ala Val Ser Val Gin Ser Leu Leu Ser Val Phe Gin 

4500 4505 4510 

Glu lie Leu Arg Arg Gly Leu Asn Gly Pro Asp Val Pro lie Ser Thr 
4515 4520 4525 

Leu Pro Leu Gln Asp Gly Ile Val Asp Leu Gln Arg Gln Gly Leu Leu 
4530 4535 4540 

Asp Val Gin Lys Thr Glu Tyr Pro Arg Asp Ser Ser Val Val Asp Val 
4545 4550 4555 4560 

Phe His Glu Gin Val Ser He Asn Pro Asp Ser He Ala Leu He His 

4565 4570 4575 

Gly Ser Glu Lys Leu Ser Tyr Ala Gln Leu Asp Arg Glu Ser Asp Arg 

4580 4585 4590 

Val Ala Arg Trp Leu Arg His Arg Ser Phe Ser Ser Asp Thr Leu He 
4595 4600 4605 

Ala Val Leu Ala Pro Arg Ser Cys Glu Thr He He Ala Phe Leu Gly 
4610 4615 4620 

He Leu Lys Ala Asn Leu Ala Tyr Leu Pro Leu Asp Val Lys Ala Pro 
4625 4630 4635 4640 

Ala Ala Arg He Asp Ala He Val Ser Ser Leu Pro Gly Asn Lys Leu 

4645 4650 4655 

He Leu Leu Gly Ala Asn Val Thr Pro Pro Lys Leu Gin Glu Ala Ala 

4660 4665 4670 

He Asp Phe Val Pro lie Arg Asp Thr Phe Thr Thr Leu Thr Asp Gly 
4675 4680 4685 

Thr Leu Gin Asp Gly Pro Thr He Glu Arg Pro Ser Ala Gin Ser Leu 
4690 4695 4700 

Ala Tyr Ala Met Phe Thr Ser Gly Ser Thr Gly Arg Pro Lys Gly Val 
4705 4710 4715 4720 

Met Val Gin His Arg Asn He Val Arg Leu Val Lys Asn Ser Asn Val 

4725 4730 4735 

Val Ala Lys Gin Pro Ala Ala Ala Arg He Ala His lie Ser Asn Leu 

4740 4745 4750 

Ala Phe Asp Ala Ser Ser Trp Glu He Tyr Ala Pro Leu Leu Asn Gly 
4755 4760 4765 

Gly Ala Ile Val Cys Ala Asp Tyr Phe Thr Thr Ile Asp Pro Gln Ala 

4770 4775 4780 

Leu Gin Glu Thr Phe Gin Glu His Glu lie Arg Gly Ala Met Leu Pro 
4785 4790 4795 4800 

Pro Ser Leu Leu Lys Gln Cys Leu Val Gln Ala Pro Asp Met Ile Ser 

4805 4810 4815 
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Arg Leu Asp lie Leu Phe Ala Ala Gly Asp Arg Phe Ser Ser Val Asp 

4820 4825 4830 

Ala Leu Gln Ala Gln Arg Leu Val Gly Ser Gly Val Phe Asn Ala Tyr 

4835 4840 4845 

Gly Pro Thr Glu Asn Thr lie Leu Ser Thr lie Tyr Asn Val Ala Glu 
4850 4855 4860 

Asn Asp Ser Phe Val Asn Gly Val Pro Ile Gly Ser Ala Val Ser Asn 

4865 4870 4875 4880 

Ser Gly Ala Tyr lie Met Asp Lys Asn Gin Gin Leu Val Pro Ala Gly 

4885 4890 4895 
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Val Met Gly Glu Leu Val Val Thr Gly Asp Gly Leu Ala Arg Gly Tyr 

4900 4905 4910 

Met Asp Pro Lys Leu Asp Ala Asp Arg Phe lie Gin Leu Thr Val Asn 
4915 4920 4925 

Gly Ser Glu Gin Val Arg Ala Tyr Arg Thr Gly Asp Arg Val Arg Tyr 
4930 4935 4940 

Arg Pro Lys Asp Phe Gin lie Glu Phe Phe Gly Arg Met Asp Gin Gin 
4945 4950 4955 4960 

Ile Lys Ile Arg Gly His Arg Ile Glu Pro Ala Glu Val Glu Gln Ala 
4965 4970 4975 

Phe Leu Asn Asp Gly Phe Val Glu Asp Val Ala He Val He Arg Thr 

4980 4985 4990 

Pro Glu Asn Gln Glu Pro Glu Met Val Ala Phe Val Thr Ala Lys Gly 
4995 5000 5005 

Asp Asn Ser Ala Arg Glu Glu Glu Ala Thr Thr Gin He Glu Gly Trp 
5010 5015 5020 

Glu Ala His Phe Glu Gly Gly Ala Tyr Ala Asn Ile Glu Glu Ile Glu 
5025 5030 5035 5040 

Ser Glu Ala Leu Gly Tyr Asp Phe Met Gly Trp Thr Ser Met Tyr Asp 

5045 5050 5055 

Gly Thr Glu Ile Asp Lys Asp Glu Met Arg Glu Trp Leu Asn Asp Thr 
5060 5065 5070 

Met Arg Ser Leu Leu Asp Gly Lys Pro Ala Gly Arg Val Leu Glu Val 
5075 5080 5085 

Gly Thr Gly Thr Gly Met Ile Met Phe Asn Leu Gly Arg Ser Gln Gly 
5090 5095 5100 

Leu Glu Arg Tyr He Gly Leu Glu Pro Ala Pro Ser Ala Ala Glu Phe 
5105 5110 5115 5120 
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Val Asn Asn Ala Ala Lys Ser Phe Pro Gly Leu Ala Gly Arg Ala Glu 

5125 5130 5135 

Val His Val Gly Thr Ala Ala Asp Val Gly Thr Leu Gin Gly Leu Thr 

5140 5145 5150 

Ser Asp Met Ala Val He Asn Ser Val Ala Gin Tyr Phe Pro Thr Pro 
5155 5160 5165 
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5525 5530 5535 

Ser Leu Met Ala Thr Lys Leu Ala Ala Arg Leu Gly Arg Gin Leu Asn 

5540 5545 5550 

Thr Arg Ile Ser Val Arg Asp Val Phe Asp Gln Pro Val Val Ala Asp 
5555 5560 5565 

Leu Ala Ala Val Ile Gln Arg Asn Ser Ala Pro His Glu Pro Ile Lys 
5570 5575 5580 

Pro Ala Asp Tyr Thr Gly Pro Val Pro Gin Ser Phe Ala Gin Gly Arg 
5585 5590 5595 5600 

Leu Trp Phe Leu Asp Gin Leu Asn Val Gly Ala Thr Trp Tyr Leu Met 

5605 5610 5615 

Pro Leu Gly He Arg Leu His Gly Ser Leu Arg Val Asp Ala Leu Ala 

5620 5625 5630 

Thr Ala He Ser Ala Leu Glu Gin Arg His Glu Pro Leu Arg Thr Thr 
5635 5640 5645 

Phe His Glu Glu Asp Gly Val Gly Val Gin Val Val Gin Asp His Arg 
5650 5655 5660 

Pro Lys Asp Leu Arg lie He Asp Leu Ser Thr Gin Pro Lys Asp Ala 
5665 5670 5675 5680 

Tyr Leu Ala Val Leu Lys His Glu Gin Thr Thr Leu Phe Asp Leu Ala 

5685 5690 5695 

Thr Glu Pro Gly Trp Arg Val Ala Leu He Arg Leu Gly Glu Glu Glu 

5700 5705 5710 

His He Leu Ser He Val Met His His He He Ser Asp Gly Trp Ser 
5715 5720 5725 

Val Glu Val Leu Phe Asp Glu Met His Arg Phe Tyr Ser Ser Ala Leu 
5730 5735 5740 

Arg Gin Gin Asp Pro Met Glu Gin He Leu Pro Leu Pro He Gin Tyr 
5745 5750 5755 5760 

Arg Asp Phe Ala Ala Trp Gin Lys Thr Glu Glu Gin Val Ala Glu His 

5765 5770 5775 

Gin Arg Gin Leu Asp Tyr Trp Thr Glu His Leu Ala Asp Ser Thr Pro 

5780 5785 5790 

Ala Glu Leu Leu Thr Asp Leu Pro Arg Pro Ser He Leu Ser Gly Arg 
5795 5800 5805 

Ala Asn Glu Leu Pro Leu Thr He Glu Gly Arg Leu His Asp Lys Leu 
5810 5815 5820 

Arg Ala Phe Cys Arg Val His Gin Ala Thr Pro Phe Val He Leu Leu 
5825 5830 5835 5840 

Ala Ala Leu Arg Ala Ala His Tyr Arg Leu Thr Gly Ala Glu Asp Ala 

5845 5850 5855 

Thr Leu Gly Thr Pro He Ala Asn Arg Asn Arg Pro Glu Leu Glu Asn 

5860 5865 5870 

Met He Gly Phe Phe Val Asn Thr Gin Cys Met Arg He Ala He Glu 
5875 5880 5885 






EP0 578 616 A2 






Glu Asn Asp Asn Phe Glu Ser Leu Val Arg Arg Val Arg Ser Thr Ala 
5890 5895 5900 

Thr Ser Ala Phe Ala Asn Gin Asp Val Pro Phe Glu Ser He Val Ser 
5905 5910 5915 5920 

Ser Leu Leu Pro Gly Ser Arg Asp Ala Ser Arg Asn Pro Leu Val Gin 

5925 5930 5935 

Val lie Leu Ala Val His Ser Gin Gin Asp Leu Gly Lys Leu Thr Leu 

5940 5945 5950 

Glu Gly Leu Arg Asp Glu Ala Val Asp Ser Ala He Ser Thr Arg Phe 
5955 5960 5965 

Asp Val Glu Phe His Leu Phe Glu His Ala Asp Arg Leu Ser Gly Ser 
5970 5975 5980 

Val Leu Tyr Ala Lys Glu Leu Phe Lys Leu Arg Thr He Glu Ser Val 
5985 5990 5995 6000 

Val Ser Val Phe Leu Glu Thr Leu Arg Arg Ala Leu Asp Gln Pro Leu 
6005 6010 6015 

Thr Pro Leu Ala Val Leu Pro Leu Thr Asp Gly Val Gly Glu He Ala 

6020 6025 6030 

Ser Lys Gly Leu Leu Asp Val Pro Arg Thr Asp Tyr Pro Arg Asp Ala 
6035 6040 6045 

Asn He Val Glu Val Phe Gin Gin His Val Arg Ala Thr Pro Asp Ala 
6050 6055 6060 

Ile Ala Val Lys Asp Ala Thr Ser Ile Leu Thr Tyr Ala Gln Leu Asp 
6065 6070 6075 6080 

Gln Gln Ser Asp Arg Leu Ala Ile Trp Leu Ser Arg Arg His Met Met 

6085 6090 6095 

Pro Glu Thr Leu Val Gly Val Leu Ala Pro Arg Ser Cys Glu Thr Ile 
6100 6105 6110 

He Ala Met Phe Gly He Met Lys Ala Asn Leu Ala Tyr Leu Pro Leu 
6115 6120 6125 






Asp He Asn Ser Pro Ala Ala Arg Leu Arg Ser He Leu Ser Ala Val 
6130 6135 6140 

Asp Gly Asn Lys Leu Val Leu Leu Gly Ser Gly Val Thr Ala Pro Glu 
6145 6150 6155 6160 

Gin Glu Asn Pro Glu Val Glu Ala Val Gly He Gin Glu He Leu Ala 

6165 6170 6175 

Gly Thr Gly Leu Asp Lys Thr Gin Gly Ser Asn Ala Arg Pro Ser Ala 

6180 6185 6190 

Thr Ser Leu Ala Tyr Val Ile Phe Thr Ser Gly Ser Thr Gly Lys Pro 
6195 6200 6205 

Lys Gly Val Met Val Glu His Arg Ser Val Thr Arg Leu Ala Lys Pro 
6210 6215 6220 

Ser Asn Val He Ser Lys Leu Pro Gin Gly Ala Arg Val Ala His Leu 
6225 6230 6235 6240 
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Ala Asn lie Ala Phe Asp Ala Ser He Trp Glu He Ala Thr Thr Leu 

6245 6250 6255 

Leu Asn Gly Ala Thr Leu Val Cys Leu Asp Tyr His Thr Val Leu Asp 

6260 6265 6270 

Cys Arg Thr Leu Lys Glu Val Phe Glu Arg Glu Ser He Thr Val Val 
6275 6280 6285 

Thr Leu Met Pro Ala Leu Leu Lys Gin Cys Val Ala Glu He Pro Glu 
6290 6295 6300 

Thr Leu Ala His Leu Asp Leu Leu Tyr Thr Gly Gly Asp Arg Val Gly 
6305 6310 6315 6320 

Gly His Asp Ala Met Arg Ala Arg Ser Leu Val Lys He Gly Met Phe 

6325 6330 6335 

Ser Gly Tyr Gly Pro Thr Glu Asn Thr Val He Ser Thr He Tyr Glu 

6340 6345 6350 

Val Asp Ala Asp Glu Met Phe Val Asn Gly Val Pro He Gly Lys Thr 
6355 6360 6365 

Val Ser Asn Ser Gly Ala Tyr Val Met Asp Arg Asn Gin Gin Leu Val 
6370 6375 6380 

Pro Ser Gly Val Val Gly Glu Leu Val Val Thr Gly Asp Gly Leu Ala 
6385 6390 6395 6400 

Arg Gly Tyr Thr Asp Pro Ser Leu Asn Lys Asn Arg Phe Ile Tyr Ile 

6405 6410 6415 

Thr Val Asn Gly Glu Ser He Arg Ala Tyr Arg Thr Gly Asp Arg Val 

6420 6425 6430 

Arg Tyr Arg Pro His Asp Leu Gin He Glu Phe Phe Gly Arg Met Asp 
6435 6440 6445 

Gin Gin Val Lys He Arg Gly His Arg He Glu Pro Gly Glu Val Glu 
6450 6455 6460 

Ser Ala Leu Leu Ser His Asn Ser Val Gin Asp Ala Ala Val Val lie 
6465 6470 6475 6480 

Cys Ala Pro Ala Asp Gin Asp Ser Gly Ala Glu Met Val Ala Phe Val 

6485 6490 6495 

Ala Ala Arg Asn Thr Glu Asp Glu Asp Thr Gin Glu Glu Glu Ala Val 

6500 6505 6510 

Asp Gin Val Gin Gly Trp Glu Thr His Phe Glu Thr Ala Ala Tyr Ser 
6515 6520 6525 

Glu Val Lys Asp He Arg Gin Ser Glu Val Gly Asn Asp Phe Met Gly 
6530 6535 6540 

Trp Thr Ser Met Tyr Asp Gly Ser Glu He Asp Lys Thr Asp Met His 
6545 6550 6555 6560 

Glu Trp Leu Asn Asp Thr Met Arg Met He Leu Asp Ala Arg Glu Pro 

6565 6570 6575 

Gly His Val Leu Glu Ile Gly Thr Gly Thr Gly Met Val Met Phe Asn 

6580 6585 6590 

Leu Ala Lys Cys Pro Gly Leu Gin Gly Tyr Val Gly Phe Glu Pro Ser 
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6595 6600 6605 

Lys Ser Ala Ala Gin Phe Val Asn Asp Ala Ala Gin Ser Phe Pro Ala 
6610 6615 6620 

Leu Lys Asp Gly Arg Ser lie Val His Val Gly Thr Ala Thr Asp lie 
6625 6630 6635 6640 

Asn Lys Ala Gly Pro lie Gin Pro Arg Leu Val Val lie Asn Ser Val 

6645 6650 6655 

Ala Gin Tyr Phe Pro Thr Pro Glu Tyr Leu Phe Arg Val Val Glu Ala 

6660 6665 6670 

Leu Val Gin lie Pro Ser Val Glu Arg lie Val Phe Gly Asp Met Arg 
6675 6680 6685 

Thr Asn Ala lie Asn Arg Asp Phe Val Ala Ser Arg Ala Leu His Thr 
6690 6695 6700 

Leu Gly Glu Lys Ala Asn Lys Arg Leu Val Arg Gln Met Ile Tyr Glu 
6705 6710 6715 6720 

Leu Glu Ala Asn Glu Glu Glu Leu Leu Thr Asp Pro Ala Phe Phe Thr 

6725 6730 6735 

Ser Leu Arg Thr Arg Leu Gly Glu Lys lie Lys His Val Glu lie Leu 

6740 6745 6750 

Pro Lys Thr Met Lys Ala Thr Asn Glu Leu Ser Lys Tyr Arg Tyr Ala 
6755 6760 6765 

Ala Val Leu His Val Arg Gly Ser Arg Glu Gin Ser Thr lie His Gin 
6770 6775 6780 

Val Ser Pro Asn Ala Trp lie Asp Phe Ala Ala Asp Gly Leu Asp Arg 
6785 6790 6795 6800 

Gin Thr Leu lie Asn Leu Leu Lys Glu His Lys Asp Ala Gly Thr Val 

6805 6810 6815 

Ala lie Gly Asn lie Pro Tyr Ser Lys Thr lie Val Glu Arg Phe Val 

6820 6825 6830 

Asn Lys Ser Leu Ser Glu Asp Asp Met Glu Glu Gly Gin Asn Ser Leu 
6835 6840 6845 

Asp Gly Ser Ala Trp Val Ala Ala Val Arg Met Ala Ala Gin Ser Cys 
6850 6855 6860 

Pro Ser Leu Asp Ala Met Asp Val Lys Glu He Ala Gin Glu Ala Gly 
6865 6870 6875 6880 

Tyr Gln Val Glu Val Ser Trp Ala Arg Gln Trp Ser Gln Asn Gly Ala 

6885 6890 6895 

Leu Asp Ala He Phe His His Phe Glu Pro Pro Lys Glu Gly Ala Arg 

6900 6905 6910 

Thr Leu lie Glu Phe Pro Thr Asp Tyr Glu Gly Arg Asn Val Asn Thr 
6915 6920 6925 

Leu Thr Asn Arg Pro Leu Asn Ser He Gin Ser Arg Arg Leu Gly Thr 
6930 6935 6940 

Gln Ile Arg Glu Lys Leu Gln Thr Leu Leu Pro Pro Tyr Met Ile Pro 
6945 6950 6955 6960 
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Ser Arg lie Met Val Leu Asp Gin Met Pro Val Asn Asn Asn Gly Lys 

6965 6970 6975 

He Asp Arg Lys Glu Leu Val Arg Arg Ala He Val Ala Pro Lys Pro 

6980 6985 6990 

Arg Ser Ala Ala Thr Arg Val Ala Pro Arg Asn Glu He Glu Ala lie 
6995 7000 7005 

Leu Arg Asp Glu Phe Glu Asp Val Leu Gly Thr Glu Val Ser Val Leu 
7010 7015 7020 

Asp Asn Phe Phe Asp Leu Gly Gly His Ser Leu Met Ala Thr Lys Leu 
7025 7030 7035 7040 

Ala Ala Arg Val Ser Arg Arg Leu Asp Ala His He Ser He Lys Asp 

7045 7050 7055 

Val Phe Asp Gin Pro Val Leu Ala Asp Leu Ala Ala Ser He Gin Arg 

7060 7065 7070 

Glu Ser Ala Pro His Glu Pro He Pro Gin Arg Pro Tyr Thr Gly Pro 
7075 7080 7085 

Ala Glu Gin Ser Phe Ala Gin Gly Arg Leu Trp Phe Leu Asp Gin Leu 
7090 7095 7100 

Asn Leu Gly Ala Thr Trp Tyr Leu Met Pro Leu Ala He Arg He Arg 
7105 7110 7115 7120 

Gly Gin Leu Arg Val Ala Ala Leu Ser Ala Ala Leu Phe Ala Leu Glu 

7125 7130 7135 

Arg Arg His Glu Thr Leu Arg Thr Thr Phe Glu Glu Ser Asp Gly Val 

7140 7145 7150 

Gly Val Gln Ile Val Gly Glu Ala Arg Asn Ser Asp Leu Arg Val His 
7155 7160 7165 

Asp Val Ser Thr Gly Asp Asp Gly Glu Tyr Leu Glu Val Leu Arg Arg 
7170 7175 7180 

Glu Gin Thr Val Pro Phe Asp Leu Ser Ser Glu Pro Gly Trp Arg Val 
7185 7190 7195 7200 

Cys Leu Val Lys Thr Gly Glu Glu Asp His Val Leu Ser He Val Met 

7205 7210 7215 

His His He He Tyr Asp Gly Trp Ser Val Asp He Leu Arg Gly Glu 

7220 7225 7230 

Leu Gly Gln Phe Tyr Ser Ala Ala Leu Arg Gly Gln Asp Pro Leu Leu 
7235 7240 7245 

His Ala Asn Pro Leu Pro Ile Gln Tyr Arg Asp Phe Ala Ala Trp Gln 
7250 7255 7260 

Arg Glu Ala Lys Gln Val Glu Glu His Gln Arg Gln Leu Gly Tyr Trp 
7265 7270 7275 7280 

Ser Lys Gin Leu Val Asp Ser Thr Pro Ala Glu Leu Leu Thr Asp Leu 

7285 7290 7295 

Pro Arg Pro Ser Ile Leu Ser Gly Arg Ala Gly Ser Val Asp Val Thr 

7300 7305 7310 
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He Glu Gly Ser Val Tyr Gly Ala Leu Gin Ser Phe Cys Arg Thr Arg 
7315 7320 7325 

Ser Val Thr Thr Phe Val Val Leu Leu Thr Val Phe Arg He Ala His 
7330 7335 7340 

Phe Arg Leu Thr Ala Val Asp Asp Ala Thr Ile Gly Thr Pro Ile Ala 
7345 7350 7355 7360 

Asn Arg Asn Arg Pro Glu Leu Glu Thr Leu Val Gly Cys Phe Val Asn 

7365 7370 7375 

Thr Gln Cys Met Arg Ile Ser Ile Ala Asp Asp Asp Asn Phe Glu Gly 

7380 7385 7390 

Leu Val Arg Gin Val Arg Asn Val Ala Thr Ala Ala Tyr Ala Asn Gin 
7395 7400 7405 

Asp Val Pro Phe Glu Arg He Val Ser Ala Leu Val Pro Gly Ser Arg 
7410 7415 7420 

Asn Thr Ser Arg Asn Pro Leu Val Gin Leu Met Phe Ala Val Gin Ser 
7425 7430 7435 7440 

Val Glu Asp Tyr Asp Gin Val Arg Leu Glu Gly Leu Glu Ser Val Met 

7445 7450 7455 

Met Pro Gly Glu Ala Ser Thr Arg Phe Asp Met Glu Phe His Leu Val 

7460 7465 7470 

Pro Gly Asp Gin Lys Leu Thr Gly Ser Val Leu Tyr Ser Ser Asp Leu 
7475 7480 7485 

Phe Glu Gin Gly Thr He Gin Asn Phe Val Asp He Phe Gin Glu Cys 
7490 7495 7500 

Leu Arg Ser Val Leu Asp Gln Pro Leu Thr Pro Ile Ser Val Leu Pro 
7505 7510 7515 7520 

Phe Ser Asn Ala Ile Ser Asn Leu Glu Ser Leu Asp Leu Leu Glu Met 

7525 7530 7535 

Pro Thr Ser Asp Tyr Pro Arg Asp Arg Thr Val Val Asp Leu Phe Arg 

7540 7545 7550 

Glu Gin Ala Ala He Cys Pro Asp Ser He Ala Val Lys Asp Ser Ser 
7555 7560 7565 

Ser Gin Leu Thr Tyr Ala Gin Leu Asp Glu Gin Ser Asp Arg Val Ala 
7570 7575 7580 

Ala Trp Leu His Glu Arg His Met Pro Ala Glu Ser Leu Val Gly Val 
7585 7590 7595 7600 

Leu Ser Pro Arg Ser Cys Glu Thr He He Ala Tyr Phe Gly He Met 

7605 7610 7615 

Lys Ala Asn Leu Ala Tyr Leu Pro Leu Asp Val Tyr Ala Pro Asp Ala 

7620 7625 7630 

Arg Leu Ala Ala He Leu Asp Thr Val Glu Gly Glu Arg Leu Leu Leu 
7635 7640 7645 

Leu Gly Ala Gly Val Pro Gin Pro Gly He Gin He Pro Arg Leu Ser 
7650 7655 7660 

Thr Ala Tyr He Ala Glu Ala Leu Ser His Ala Thr Thr Val Asp Val 
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7665 7670 7675 7680 

Thr Ser lie Pro Gin Pro Ser Ala Thr Ser Leu Ala Tyr Val lie Phe 

7685 7690 7695 

Thr Ser Gly Ser Thr Gly Lys Pro Lys Gly Val Met He Glu His Arg 

7700 7705 7710 

Gly He Val Arg Leu Val Arg Asp Thr Asn Val Asn Val Phe Pro Glu 
7715 7720 7725 

Ser Gly Ser Ala Leu Pro Val Ser His Phe Ser Asn Leu Ala Trp Asp 
7730 7735 7740 

Ala Ala Thr Trp Glu Ile Tyr Thr Ala Val Leu Asn Gly Gly Thr Val 
7745 7750 7755 7760 

Val Cys He Asp Arg Asp Thr Met Leu Asp He Ala Ala Leu Asn Ser 

7765 7770 7775 

Thr Phe Arg Lys Glu Asn Val Arg Ala Ala Phe Phe Thr Pro Ala Phe 

7780 7785 7790 

Leu Lys Gin Cys Leu Ala Glu Thr Pro Glu Leu Val Ala Asn Leu Glu 
7795 7800 7805 

He Leu His Thr Ala Gly Asp Arg Leu Asp Pro Gly Asp Ala Asn Leu 
7810 7815 7820 

Ala Gly Lys Thr Ala Lys Gly Gly He Phe Asn Val Leu Gly His Thr 
7825 7830 7835 7840 

Glu Asn Thr Ala Tyr Ser Thr Phe Tyr Pro Val Val Gly Glu Glu Thr 

7845 7850 7855 

Phe Val Asn Gly Val Pro Val Gly Arg Gly He Ser Asn Ser His Ala 

7860 7865 7870 

Tyr He He Asp Arg His Gin Lys Leu Val Pro Ala Gly Val Met Gly 
7875 7880 7885 

Glu Leu He Leu Thr Gly Asp Gly Val Ala Arg Gly Tyr Thr Asp Ser 
7890 7895 7900 

Ala Leu Asn Lys Asp Arg Phe Val Tyr He Asp He Asn Gly Lys Ser 
7905 7910 7915 7920 

Thr Trp Ser Tyr Arg Thr Gly Asp Lys Ala Arg Tyr Arg Pro Arg Asp 

7925 7930 7935 

Gly Gin Leu Glu Phe Phe Gly Arg Met Asp Gin Met Val Lys He Arg 

7940 7945 7950 

Gly Val Arg Ile Glu Pro Gly Glu Val Glu Leu Thr Leu Leu Asp His 

7955 7960 7965 

Lys Ser Val Leu Ala Ala Thr Val Val Val Arg Arg Pro Pro Asn Gly 
7970 7975 7980 

Asp Pro Glu Met Ile Ala Phe Ile Thr Ile Asp Ala Glu Asp Asp Val 

7985 7990 7995 8000 

Gin Thr His Lys Ala He Tyr Lys His Leu Gin Gly He Leu Pro Ala 

8005 8010 8015 

Tyr Met Ile Pro Ser His Leu Val Ile Leu Asp Gln Met Pro Val Thr 

8020 8025 8030 
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Asp Asn Gly Lys Val Asp Arg Lys Asp Leu Ala Leu Arg Ala Gin Thr 
8035 8040 8045 

Val Gln Lys Arg Arg Ser Thr Ala Ala Arg Val Pro Pro Arg Asp Glu 
8050 8055 8060 

Val Glu Ala Val Leu Cys Glu Glu Tyr Ser Asn Leu Leu Glu Val Glu 
8065 8070 8075 8080 

Val Gly Ile Thr Asp Gly Phe Phe Asp Leu Gly Gly His Ser Leu Leu 
8085 8090 8095 

Ala Thr Lys Leu Ala Ala Arg Leu Ser Arg Gin Leu Asn Thr Arg Val 

8100 8105 8110 

Ser Val Lys Asp Val Phe Asp Gln Pro Ile Leu Ala Asp Leu Ala Asp 
8115 8120 8125 

He He Arg Arg Gly Ser His Arg His Asp Pro He Pro Ala Thr Pro 
8130 8135 8140 

Tyr Thr Gly Pro Val Glu Gln Ser Phe Ala Gln Gly Arg Leu Trp Phe 
8145 8150 8155 8160 

Leu Glu Gin Leu Asn Leu Gly Ala Ser Trp Tyr Leu Met Pro Phe Ala 

8165 8170 8175 

Ile Arg Met Arg Gly Pro Leu Gln Thr Lys Ala Leu Ala Val Ala Leu 
8180 8185 8190 

Asn Ala Leu Val His Arg His Glu Ala Leu Arg Thr Thr Phe Glu Asp 
8195 8200 8205 
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His Asp Gly Val Gly Val Gin Val He Gin Pro Lys Ser Ser Gin Asp 
8210 8215 8220 

Leu Arg He He Asp Leu Ser Asp Ala Val Asp Asp Thr Ala Tyr Leu 
8225 8230 8235 8240 

Ala Ala Leu Lys Arg Glu Gin Thr Thr Ala Phe Asp Leu Thr Ser Glu 

8245 8250 8255 

Pro Gly Trp Arg Val Ser Leu Leu Arg Leu Gly Asp Asp Asp Tyr He 

8260 8265 8270 

Leu Ser He Val Met His His He He Ser Asp Gly Trp Thr Val Asp 
8275 8280 8285 

Val Leu Arg Gln Glu Leu Gly Gln Phe Tyr Ser Ala Ala Ile Arg Gly 
8290 8295 8300 

Gin Glu Pro Leu Ser Gin Ala Lys Ser Leu Pro He Gin Tyr Arg Asp 
8305 8310 8315 8320 

Phe Ala Val Trp Gin Arg Gin Glu Asn Gin He Lys Glu Gin Ala Lys 

8325 8330 8335 

Gln Leu Lys Tyr Trp Ser Gln Gln Leu Ala Asp Ser Thr Pro Cys Glu 

8340 8345 8350 

Phe Leu Thr Asp Leu Pro Arg Pro Ser He Leu Ser Gly Glu Ala Asp 
8355 8360 8365 

Ala Val Pro Met Val He Asp Gly Thr Val Tyr Gin Leu Leu Thr Asp 
8370 8375 8380 
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Phe Cys Arg Thr His Gin Val Thr Ser Phe Ser Val Leu Leu Ala Ala 
8385 8390 8395 8400 

Phe Arg Thr Ala His Tyr Arg Leu Thr Gly Thr Leu Asp Ala Thr Val 

8405 8410 8415 

Gly Thr Pro lie Ala Asn Arg Asn Arg Pro Glu Leu Glu Gly Leu lie 

8420 8425 8430 

Gly Phe Phe Val Asn Thr Gin Cys Met Arg Met Ala lie Ser Glu Thr 
8435 8440 8445 

Glu Thr Phe Glu Ser Leu Val Gin Gin Val Arg Leu Thr Thr Thr Glu 
8450 8455 8460 

Ala Phe Ala Asn Gln Asp Val Pro Phe Glu Gln Ile Val Ser Thr Leu 
8465 8470 8475 8480 

Leu Pro Gly Ser Arg Asp Thr Ser Arg Asn Pro Leu Val Gin Val Met 

8485 8490 8495 

Phe Ala Leu Gln Ser Gln Gln Asp Leu Gly Arg Ile Gln Leu Glu Gly 
8500 8505 8510 

Met Thr Asp Glu Ala Leu Glu Thr Pro Leu Ser Thr Arg Leu Asp Leu 
8515 8520 8525 

Glu Val His Leu Phe Gln Glu Val Gly Lys Leu Ser Gly Ser Leu Leu 
8530 8535 8540 

Tyr Ser Thr Asp Leu Phe Glu Val Glu Thr He Arg Gly He Val Asp 
8545 8550 8555 8560 

Val Phe Leu Glu Ile Leu Arg Arg Gly Leu Glu Gln Pro Lys Gln Arg 
8565 8570 8575 

Leu Met Ala Met Pro He Thr Asp Gly He Thr Lys Leu Arg Asp Gin 

8580 8585 8590 
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Gly Leu Leu Thr Val Ala Lys Pro Ala Tyr Pro Arg Glu Ser Ser Val 
8595 8600 8605 

He Asp Leu Phe Arg Gin Gin Val Ala Ala Ala Pro Asp Ala He Ala 
8610 8615 8620 

Val Trp Asp Ser Ser Ser Thr Leu Thr Tyr Ala Asp Leu Asp Gly Gln 
8625 8630 8635 8640 

Ser Asn Lys Leu Ala His Trp Leu Cys Gin Arg Asn Met Ala Pro Glu 

8645 8650 8655 

Thr Leu Val Ala Val Phe Ala Pro Arg Ser Cys Leu Thr He Val Ala 

8660 8665 8670 

Phe Leu Gly Val Leu Lys Ala Asn Leu Ala Tyr Leu Pro Leu Asp Val 
8675 8680 8685 

Asn Ala Pro Ala Ala Arg Ile Glu Ala Ile Leu Ser Ala Val Pro Gly 
8690 8695 8700 

His Lys Leu Val Leu Val Gin Ala His Gly Pro Glu Leu Gly Leu Thr 
8705 8710 8715 8720 

Met Ala Asp Thr Glu Leu Val Gin He Asp Glu Ala Leu Ala Ser Ser 

8725 8730 8735 

Ser Ser Gly Asp His Glu Gin He His Ala Ser Gly Pro Thr Ala Thr 
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8740 8745 8750 

Ser Leu Ala Tyr Val Met Phe Thr Ser Gly Ser Thr Gly Lys Pro Lys 
8755 8760 8765 

Gly Val Met He Asp His Arg Ser He He Arg Leu Val Lys Asn Ser 
8770 8775 8780 

Asp Val Val Ala Thr Leu Pro Thr Pro Val Arg Met Ala Asn Val Ser 
8785 8790 8795 8800 

Asn Leu Ala Phe Asp He Ser Val Gin Glu He Tyr Thr Ala Leu Leu 

8805 8810 8815 

Asn Gly Gly Thr Leu Val Cys Leu Asp Tyr Leu Thr Leu Leu Asp Ser 

8820 8825 8830 

Lys He Leu Tyr Asn Val Phe Val Glu Ala Gin Val Asn Ala Ala Met 
8835 8840 8845 

Phe Thr Pro Val Leu Leu Lys Gin Cys Leu Gly Asn Met Pro Ala He 
8850 8855 8860 

He Ser Arg Leu Ser Val Leu Phe Asn Val Gly Asp Arg Leu Asp Ala 
8865 8870 8875 8880 

His Asp Ala Val Ala Ala Ser Gly Leu lie Gin Asp Ala Val Tyr Asn 

8885 8890 8895 

Ala Tyr Gly Pro Thr Glu Asn Gly Met Gln Ser Thr Met Tyr Lys Val 

8900 8905 8910 

Asp Val Asn Glu Pro Phe Val Asn Gly Val Pro He Gly Arg Ser He 
8915 8920 8925 

Thr Asn Ser Gly Ala Tyr Val Met Asp Gly Asn Gln Gln Leu Val Ser 

8930 8935 8940 

Pro Gly Val Met Gly Glu He Val Val Thr Gly Asp Gly Leu Ala Arg 
8945 8950 8955 8960 

Gly Tyr Thr Asp Ser Ala Leu Asp Glu Asp Arg Phe Val His Val Thr 

8965 8970 8975 

He Asp Gly Glu Glu Asn lie Lys Ala Tyr Arg Thr Gly Asp Arg Val 

8980 8985 8990 

Arg Tyr Arg Pro Lys Asp Phe Glu Ile Glu Phe Phe Gly Arg Met Asp 

8995 9000 9005 

Gin Gin Val Lys He Arg Gly His Arg lie Glu Pro Ala Glu Val Glu 
9010 9015 9020 

His Ala Leu Leu Gly His Asp Leu Val His Asp Ala Ala Val Val Leu 

9025 9030 9035 9040 

Arg Lys Pro Ala Asn Gin Glu Pro Glu Met lie Ala Phe lie Thr Ser 

9045 9050 9055 

Gln Glu Asp Glu Thr Ile Glu Gln His Glu Ser Asn Lys Gln Val Gln 

9060 9065 9070 

Gly Trp Gly Glu His Phe Asp Val Ser Arg Tyr Ala Asp lie Lys Asp 
9075 9080 9085 

Leu Asp Thr Ser Thr Phe Gly His Asp Phe Leu Gly Trp Thr Ser Met 

9090 9095 9100 
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Tyr Asp Gly Val Asp lie Pro Val Asn Glu Met Lys Glu Trp Leu Asp 
9105 9110 9115 9120 

Glu Thr Thr Ala Ser Leu Leu Asp Asn Arg Pro Pro Gly His lie Leu 

9125 9130 9135 

Glu Ile Gly Ala Gly Thr Gly Met Ile Leu Ser Asn Leu Gly Lys Val 

9140 9145 9150 

Asp Gly Leu Gin Lys Tyr Val Gly Leu Asp Pro Ala Pro Ser Ala Ala 
9155 9160 9165 

lie Phe Val Asn Glu Ala Val Lys Ser Leu Pro Ser Leu Ala Gly Lys 
9170 9175 9180 

Ala Arg Val Leu Val Gly Thr Ala Leu Asp Ile Gly Ser Leu Asp Lys 
9185 9190 9195 9200 

Asn Glu lie Gin Pro Glu Leu Val Val lie Asn Ser Val Ala Gin Tyr 

9205 9210 9215 

Phe Pro Thr Ser Glu Tyr Leu Ile Lys Val Val Lys Ala Val Val Glu 
9220 9225 9230 

Val Pro Ser Val Lys Arg Val Phe Phe Gly Asp Ile Arg Ser Gln Ala 
9235 9240 9245 

Leu Asn Arg Asp Phe Leu Ala Ala Arg Ala Val Arg Ala Leu Gly Asp 
9250 9255 9260 

Asn Ala Ser Lys Glu Gin lie Arg Glu Lys lie Ala Glu Leu Glu Glu 
9265 9270 9275 9280 

Ser Glu Glu Glu Leu Leu Val Asp Pro Ala Phe Phe Val Ser Leu Arg 
9285 9290 9295 

Ser Gin Leu Pro Asn lie Lys His Val Glu Val Leu Pro Lys Leu Met 

9300 9305 9310 

Lys Ala Thr Asn Glu Leu Ser Ser Tyr Arg Tyr Ala Ala Val Leu His 
9315 9320 9325 

lie Ser His Asn Glu Glu Glu Gin Leu Leu lie Gin Asp lie Asp Pro 
9330 9335 9340 
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Thr Ala Trp Val Asp Phe Ala Ala Thr Gin Lys Asp Ser Gin Gly Leu 
9345 9350 9355 9360 

Arg Asn Leu Leu Gin Gin Gly Arg Asp Asp Val Met lie Ala Val Gly 

9365 9370 9375 

Asn lie Pro Tyr Ser Lys Thr He Val Glu Arg His He Met Asn Ser 

9380 9385 9390 

Leu Asp Gin Asp His Val Asn Ser Leu Asp Gly Thr Ser Trp He Ser 
9395 9400 9405 

Asp Ala Arg Ser Ala Ala Ala He Cys Thr Ser Phe Asp Ala Pro Ala 
9410 9415 9420 

Leu Thr Gin Leu Ala Lys Glu Glu Gly Phe Arg Val Glu Leu Ser Trp 
9425 9430 9435 9440 

Ala Arg Gin Arg Ser Gin Asn Gly Ala Leu Asp Ala Val Phe His Arg 

9445 9450 9455 
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Leu Ala Thr Asp Ala Asn Cys Glu Arg Ser Arg Val Leu Val His Phe 

9460 9465 9470 

Pro Thr Asp His Gin Gly Arg Gin Leu Arg Thr Leu Thr Asn Arg Pro 
9475 9480 9485 

Leu Gin Arg Ala Gin Ser Arg Arg lie Glu Ser Gin Val Phe Glu Ala 
9490 9495 9500 

Leu Gin Thr Ala Leu Pro Ala Tyr Met He Pro Ser Arg He He Val 
9505 9510 9515 9520 

Leu Pro Gln Met Pro Thr Asn Ala Asn Gly Lys Val Asp Arg Lys Gln 

9525 9530 9535 

Leu Ala Arg Arg Ala Gln Val Val Ala Lys Arg Lys Ala Val Ser Ala 

9540 9545 9550 

Arg Val Ala Pro Arg Asn Asp Thr Glu lie Val Leu Cys Glu Glu Tyr 
9555 9560 9565 

Ala Asp lie Leu Gly Thr Glu Val Gly lie Thr Asp Asn Phe Phe Asp 
9570 9575 9580 

Met Gly Gly His Ser Leu Met Ala Thr Lys Leu Ala Ala Arg Leu Ser 
9585 9590 9595 9600 

Arg Arg Leu Asp Thr Arg Val Thr Val Lys Glu Val Phe Asp Lys Pro 

9605 9610 9615 

Val Leu Ala Asp Leu Ala Ala Ser Ile Glu Gln Gly Ser Thr Pro His 

9620 9625 9630 

Leu Pro He Ala Ser Ser Val Tyr Ser Gly Pro Val Glu Gin Ser Tyr 
9635 9640 9645 

Ala Gin Gly Arg Leu Trp Phe Leu Asp Gin Phe Asn Leu Asn Ala Thr 
9650 9655 9660 

Trp Tyr His Met Ser Leu Ala Met Arg Leu Leu Gly Pro Leu Asn Met 
9665 9670 9675 9680 

Asp Ala Leu Asp Val Ala Leu Arg Ala Leu Glu Gin Arg His Glu Thr 

9685 9690 9695 

Leu Arg Thr Thr Phe Glu Ala Gin Lys Asp He Gly Val Gin Val Val 

9700 9705 9710 

His Glu Ala Gly Met Lys Arg Leu Lys Val Leu Asp Leu Ser Asp Lys 
9715 9720 9725 

Asn Glu Lys Glu His Met Ala Val Leu Glu Asn Glu Gin Met Arg Pro 
9730 9735 9740 

Phe Thr Leu Ala Ser Glu Pro Gly Trp Lys Gly His Leu Ala Arg Leu 
9745 9750 9755 9760 

Gly Pro Thr Glu Tyr lie Leu Ser Leu Val Met His His Met Phe Ser 

9765 9770 9775 

Asp Gly Trp Ser Val Asp He Leu Arg Gin Glu Leu Gly Gin Phe Tyr 

9780 9785 9790 

Ser Ala Ala Leu Arg Gly Arg Asp Pro Leu Ser Gin Val Lys Pro Leu 
9795 9800 9805 

Pro He Gin Tyr Arg Asp Phe Ala Ala Trp Gin Lys Glu Ala Ala Gin 






9810 9815 9820 

Val Ala Glu His Glu Arg Gin Leu Ala Tyr Trp Glu Asn Gin Leu Ala 
9825 9830 9835 9840 

Asp Ser Thr Pro Gly Glu Leu Leu Thr Asp Phe Pro Arg Pro Gin Phe 

9845 9850 9855 

Leu Ser Gly Lys Ala Gly Val He Pro Val Thr He Glu Gly Pro Val 

9860 9865 9870 

Tyr Glu Lys Leu Leu Lys Phe Ser Lys Glu Arg Gln Val Thr Leu Phe 
9875 9880 9885 

Ser Val Leu Leu Thr Ala Phe Arg Ala Thr His Phe Arg Leu Thr Gly 
9890 9895 9900 

Ala Glu Asp Ala Thr He Gly Thr Pro He Ala Asn Arg Asn Arg Pro 
9905 9910 9915 9920 

Glu Leu Glu His He He Gly Phe Phe Val Asn Thr Gin Cys Met Arg 

9925 9930 9935 

Leu Leu Leu Asp Thr Gly Ser Thr Phe Glu Ser Leu Val Gin His Val 

9940 9945 9950 

Arg Ser Val Ala Thr Asp Ala Tyr Ser Asn Gin Asp He Pro Phe Glu 
9955 9960 9965 

Arg He Val Ser Ala Leu Leu Pro Gly Ser Arg Asp Ala Ser Arg Ser 
9970 9975 9980 

Pro Leu He Gin Leu Met Phe Ala Leu His Ser Gin Pro Asp Leu Gly 
9985 9990 9995 10000 

Asn He Thr Leu Glu Gly Leu Glu His Glu Arg Leu Pro Thr Ser Val 

10005 10010 10015 

Ala Thr Arg Phe Asp Met Glu Phe His Leu Phe Gin Glu Pro Asn Lys 

10020 10025 10030 

Leu Ser Gly Ser He Leu Phe Ala Asp Glu Leu Phe Gin Pro Glu Thr 
10035 10040 10045 

He Asn Ser Val Val Thr Val Phe Gin Glu He Leu Arg Arg Gly Leu 
10050 10055 10060 

Asp Gln Pro Gln Val Ser Ile Ser Thr Met Pro Leu Thr Asp Gly Leu 

10065 10070 10075 10080 

He Asp Leu Glu Lys Leu Gly Leu Leu Glu He Glu Ser Ser Asn Phe 

10085 10090 10095 

Pro Arg Asp Tyr Ser Val Val Asp Val Phe Arg Gln Gln Val Ala Ala 

10100 10105 10110 

Asn Pro Asn Ala Pro Ala Val Val Asp Ser Glu Thr Ser Met Ser Tyr 
10115 10120 10125 

Thr Ser Leu Asp Gln Lys Ser Glu Gln Ile Ala Ala Trp Leu His Ala 

10130 10135 10140 

Gin Gly Leu Arg Pro Glu Ser Leu He Cys Val Met Ala Pro Arg Ser 
10145 10150 10155 10160 

Phe Glu Thr Ile Val Ser Leu Phe Gly Ile Leu Lys Ala Gly Tyr Ala 

10165 10170 10175 
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Tyr Leu Pro Leu Asp Val Asn Ser Pro Ala Ala Arg He Gin Pro He 

10180 10185 10190 

Leu Ser Glu Val Glu Gly Lys Arg Leu Val Leu Leu Gly Ser Gly He 
10195 10200 10205 

Asp Met Pro Gin Ser Asp Arg Met Asp Val Glu Thr Ala Arg lie Gin 
10210 10215 10220 

Asp He Leu Thr Asn Thr Lys Val Glu Arg Ser Asp Pro Met Ser Arg 
10225 10230 10235 10240 

Pro Ser Ala Thr Ser Leu Ala Tyr Val He Phe Thr Ser Gly Ser Thr 

10245 10250 10255 

Gly Arg Pro Lys Gly Val Met He Glu His Arg Asn He Leu Arg Leu 

10260 10265 10270 

Val Lys Gin Ser Asn Val Thr Ser Gin Leu Pro Gin Asp Leu Arg Met 
10275 10280 10285 

Ala His He Ser Asn Leu Ala Phe Asp Ala Ser He Trp Glu He Phe 
10290 10295 10300 

Thr Ala Ile Leu Asn Gly Gly Ala Leu Ile Cys Ile Asp Tyr Phe Thr 
10305 10310 10315 10320 

Leu Leu Asp Ser Gin Ala Leu Arg Thr Thr Phe Glu Lys Ala Arg Val 

10325 10330 10335 

Asn Ala Thr Leu Phe Ala Pro Ala Leu Leu Lys Glu Cys Leu Asn His 

10340 10345 10350

Ala Pro Thr Leu Phe Glu Asp Leu Lys Val Leu Tyr He Gly Gly Asp 
10355 10360 10365 

Arg Leu Asp Ala Thr Asp Ala Ala Lys He Gin Ala Leu Val Lys Gly 
10370 10375 10380 

Thr Val Tyr Asn Ala Tyr Gly Pro Thr Glu Asn Thr Val Met Ser Thr 
10385 10390 10395 10400 

He Tyr Arg Leu Thr Asp Gly Glu Ser Tyr Ala Asn Gly Val Pro He 

10405 10410 10415 

Gly Asn Ala Val Ser Ser Ser Gly Ala Tyr He Met Asp Gin Lys Gin 

10420 10425 10430 

Arg Leu Val Pro Pro Gly Val Met Gly Glu Leu Val Val Ser Gly Asp 
10435 10440 10445 

Gly Leu Ala Arg Gly Tyr Thr Asn Ser Thr Leu Asn Ala Asp Arg Phe 
10450 10455 10460 

Val Asp He Val He Asn Asp Gin Lys Ala Arg Ala Tyr Arg Thr Gly 
10465 10470 10475 10480 

Asp Arg Thr Arg Tyr Arg Pro Lys Asp Gly Ser He Glu Phe Phe Gly 

10485 10490 10495 

Arg Met Asp Gin Gin Val Lys He Arg Gly His Arg Val Glu Pro Ala 

10500 10505 10510 

Glu Val Glu Gin Ala Met Leu Gly Asn Lys Ala He His Asp Ala Ala 
10515 10520 10525

Val Val Val Gln Ala Val Asp Gly Gln Glu Thr Glu Met Ile Gly Phe
10530 10535 10540 

Val Ser Met Ala Ser Asp Arg Phe Ser Glu Gly Glu Glu Glu He Thr 
10545 10550 10555 10560 

Asn Gin Val Gin Glu Trp Glu Asp His Phe Glu Ser Thr Ala Tyr Ala 

10565 10570 10575 

Gly He Glu Ala He Asp Gin Ala Thr Leu Gly Arg Asp Phe Thr Ser 

10580 10585 10590 

Trp Thr Ser Met Tyr Asn Gly Asn Leu He Asp Lys Ala Glu Met Glu 
10595 10600 10605

Glu Trp Leu Asp Asp Thr Met Gin Ser Leu Leu Asp Lys Glu Asp Ala 
10610 10615 10620 

Arg Pro Cys Ala Glu He Gly Thr Gly Thr Gly Met Val Leu Phe Asn 
10625 10630 10635 10640 

Leu Pro Lys Asn Asp Gly Leu Glu Ser Tyr Val Gly He Glu Pro Ser 

10645 10650 10655 

Arg Ser Ala Ala Leu Phe Val Asp Lys Ala Ala Gln Asp Phe Pro Gly

10660 10665 10670 

Leu Gln Gly Lys Thr Gln Ile Leu Val Gly Thr Ala Glu Asp Ile Lys
10675 10680 10685 

Leu Val Lys Asp Phe His Pro Asp Val Val Val Ile Asn Ser Val Ala
10690 10695 10700

Gln Tyr Phe Pro Ser Arg Ser Tyr Leu Val Gln Ile Ala Ser Glu Leu
10705 10710 10715 10720

He His Met Thr Ser Val Lys Thr He Phe Phe Gly Asp Met Arg Ser 

10725 10730 10735 

Trp Ala Thr Asn Arg Asp Phe Leu Val Ser Arg Ala Leu Tyr Thr Leu 

10740 10745 10750

Gly Asp Lys Ala Thr Lys Asp Gln Ile Arg Gln Glu Val Ala Arg Leu
10755 10760 10765

Glu Glu Asn Glu Asp Glu Leu Leu Val Asp Pro Ala Phe Phe Thr Ser 
10770 10775 10780 

[illegible] Thr Ser Gln Trp Pro Gly Lys Val Lys His Val Glu Ile Leu Pro
10785 10790 10795 10800 

Lys Arg Met Arg Thr Ser Asn Glu Leu Ser Ser Tyr Arg Tyr Ala Ala 

10805 10810 10815 

Val Leu His Ile Cys Arg Asp Gly Glu Gly Arg Asn Arg Tyr Gly Arg

10820 10825 10830 

Arg Val His Ser Val Glu Glu Asn Ala Trp He Asp Phe Ala Ser Ser 
10835 10840 10845 

Gly [illegible] Asp Arg His Ala Leu Val Gln Met Leu Asp Glu Arg Arg Asp
10850 10855 10860

Ala Lys Thr Val Ala Ile Gly Asn Ile Pro His Ser Asn Thr Ile Asn
10865 10870 10875 10880

Glu Arg His Phe Thr Thr Ser Leu Asp Thr Glu Gly Glu Gly Ile Ala
10885 10890 10895

Gin Asp Ser Leu Asp Gly Ser Ala Trp Gin Ser Ala Thr Lys Ala Met 

10900 10905 10910 

Ala Ala Arg Cys Pro Cys Leu Ser Val Thr Glu Leu Val Glu lie Gly 
10915 10920 10925 

Gin Ala Ala Gly Phe Arg Val Glu Val Ser Trp Ala Arg Gin Arg Ser 
10930 10935 10940 

Gin His Gly Ala Leu Asp Val Val Phe His His Leu Glu Asp Asp Arg 
10945 10950 10955 10960 

Val Gly Arg Val Leu lie Asn Phe Pro Thr Asp Phe Glu Arg Leu Pro 

10965 10970 10975 

Pro Ser Thr Gly Leu Thr Ser Arg Pro Leu Gin Arg lie Gin Asn Arg 

10980 10985 10990 

Arg Phe Glu Ser Gin lie Arg Glu Gin Leu Gin Thr Leu Leu Pro Pro 
10995 11000 11005 

Tyr Met Val Pro Ser Arg lie Val Val Leu Glu Arg Met Pro Leu Asn 
11010 11015 11020 

Ala Asn Ser Lys Val Asp Arg Lys Glu Leu Ala Arg Lys Ala Arg Thr 
11025 11030 11035 11040 

Leu Gin Thr lie Lys Pro Ser Ala Thr Arg Val Ala Pro Arg Asn Asp 

11045 11050 11055 

lie Glu Ala Val Leu Cys Asp Glu Phe Gin Ala Val Leu Gly Val Thr 

11060 11065 11070 

Val Gly Val Met Asp Asn Phe Phe Glu Leu Gly Gly His Ser Leu Met 
11075 11080 11085 

Ala Thr Lys Leu Ala Ala Arg Leu Ser Arg Arg Leu Asp Thr Arg Val 
11090 11095 11100 

Ser Val Lys Asp lie Phe Asn Gin Pro lie Leu Gin Asp Leu Ala Asp 
11105 11110 11115 11120 

Val Val Gin Thr Gly Ser Ala Pro His Glu Ala lie Pro Ser Thr Pro 

11125 11130 11135 

Tyr Ser Gly Pro Val Glu Gin Ser Phe Ser Gin Gly Arg Leu Trp Phe 

11140 11145 11150

Leu Asp Gin Leu Asn Leu Asn Ala Ser Trp Tyr His Met Pro Leu Ala 
11155 11160 11165 

Ser Arg Leu Arg Gly Pro Leu Arg He Glu Ala Leu Gin Ser Ala Leu 
11170 11175 11180 

Ala Thr He Glu Ala Arg His Glu Ser Leu Arg Thr Thr Phe Glu Glu 
11185 11190 11195 11200

Gin Asp Gly Val Pro Val Gin He Val Arg Ala Ala Arg Asn Lys Gin 

11205 11210 11215 

Leu Arg He He Asp Val Ser Gly Thr Glu Asp Ala Tyr Leu Ala Ala 

11220 11225 11230

Leu Lys Gin Glu Gin Asp Ala Ala Phe Asp Leu Thr Ala Glu Pro Gly 
11235 11240 11245

Trp Arg Val Ala Leu Leu Arg Leu Gly Pro Asp Asp His Val Leu Ser
11250 11255 11260 

Ile Val Met His His Ile Ile Ser Asp Gly Trp Ser Val Asp Ile Leu
11265 11270 11275 11280

Arg Gin Glu Leu Gly Gin Leu Tyr Ser Asn Ala Ser Ser Gin Pro Ala 

11285 11290 11295 

Pro Leu Pro lie Gin Tyr Arg Asp Phe Ala lie Trp Gin Lys Gin Asp 

11300 11305 11310 

Ser Gin lie Ala Glu His Gin Lys Gin Leu Asn Tyr Trp Lys Arg Gin 
11315 11320 11325 

Leu Val Asn Ser Lys Pro Ala Glu Leu Leu Ala Asp Phe Thr Arg Pro 
11330 11335 11340 

Lys Ala Leu Ser Gly Asp Ala Asp Val lie Pro lie Glu lie Asp Asp 
11345 11350 11355 11360 

Gin Val Tyr Gin Asn Leu Arg Ser Phe Cys Arg Ala Arg His Val Thr 
11365 11370 11375

Ser Phe Val Ala Leu Leu Ala Ala Phe Arg Ala Ala His Tyr Arg Leu 

11380 11385 11390 

Thr Gly Ala Glu Asp Ala Thr lie Gly Ser Pro lie Ala Asn Arg Asn 
11395 11400 11405

Arg Pro Glu Leu Glu Gly Leu lie Gly Cys Phe Val Asn Thr Gin Cys 
11410 11415 11420 

Leu Arg lie Pro Val Lys Ser Glu Asp Thr Phe Asp Thr Leu Val Lys 
11425 11430 11435 11440

Gin Ala Arg Glu Thr Ala Thr Glu Ala Gin Asp Asn Gin Asp Val Pro 

11445 11450 11455

Phe Glu Arg Ile Val Ser Ser Met Val Ala Ser Ser Arg Asp Thr Ser

11460 11465 11470 

Arg Asn Pro Leu Val Gin Val Met Phe Ala Val His Ser Gin His Asp 
11475 11480 11485 

Leu Gly Asn He Arg Leu Glu Gly Val Glu Gly Lys Pro Val Ser Met 
11490 11495 11500 

Ala Ala Ser Thr Arg Phe Asp Ala Glu Met His Leu Phe Glu Asp Gin 
11505 11510 11515 11520 

Gly Met Leu Gly Gly Asn Val Val Phe Ser Lys Asp Leu Phe Glu Ser 

11525 11530 11535 

Glu Thr He Arg Ser Val Val Ala Val Phe Gin Glu Thr Leu Arg Arg 

11540 11545 11550 

Gly Leu Ala Asn Pro His Ala Asn Leu Ala Thr Leu Pro Leu Thr Asp 
11555 11560 11565 

Gly Leu Pro Ser Leu Arg Ser Leu Cys Leu Gin Val Asn Gin Pro Asp 
11570 11575 11580 

Tyr Pro Arg Asp Ala Ser Val He Asp Val Phe Arg Glu Gin Val Ala 
11585 11590 11595 11600

Ser Ile Pro Lys Ser Ile Ala Val Ile Asp Ala Ser Ser Gln Leu Thr

11605 11610 11615 

Tyr Thr Glu Leu Asp Glu Arg Ser Ser Gin Leu Ala Thr Trp Leu Arg 

11620 11625 11630 

Arg Gin Val Thr Val Pro Glu Glu Leu Val Gly Val Leu Ala Pro Arg 
11635 11640 11645

Ser Cys Glu Thr lie lie Ala Phe Leu Gly lie He Lys Ala Asn Leu 
11650 11655 11660 

Ala Tyr Leu Pro Leu Asp Val Asn Ala Pro Ala Gly Arg He Glu Thr 
11665 11670 11675 11680 

He Leu Ser Ser Leu Pro Gly Asn Arg Leu He Leu Leu Gly Ser Asp 

11685 11690 11695 

Thr Gin Ala Val Lys Leu His Ala Asn Ser Val Arg Phe Thr Arg He 

11700 11705 11710 

Ser Asp Ala Leu Val Glu Ser Gly Ser Pro Pro Thr Glu Glu Leu Ser 
11715 11720 11725 

Thr Arg Pro Thr Ala Gin Ser Leu Ala Tyr Val Met Phe Thr Ser Gly 
11730 11735 11740 

Ser Thr Gly Val Pro Lys Gly Val Met Val Glu His Arg Gly He Thr 
11745 11750 11755 11760

Arg Leu Val Lys Asn Ser Asn Val Val Ala Lys Gin Pro Ala Ala Ala 

11765 11770 11775 

Ala He Ala His Leu Ser Asn He Ala Phe Asp Ala Ser Ser Trp Glu 

11780 11785 11790 

He Tyr Ala Pro Leu Leu Asn Gly Gly Thr Val Val Cys He Asp Tyr 
11795 11800 11805 

Tyr Thr Thr He Asp He Lys Ala Leu Glu Ala Val Phe Lys Gin His 
11810 11815 11820 

His He Arg Gly Ala Met Leu Pro Pro Ala Leu Leu Lys Gin Cys Leu 
11825 11830 11835 11840

Val Ser Ala Pro Thr Met He Ser Ser Leu Glu He Leu Phe Ala Ala 

11845 11850 11855 

Gly Asp Arg Leu Ser Ser Gin Asp Ala He Leu Ala Arg Arg Ala Val 

11860 11865 11870 

Gly Ser Gly Val Tyr Asn Ala Tyr Gly Pro Thr Glu Asn Thr Val Leu 
11875 11880 11885 

Ser Thr He His Asn He Gly Glu Asn Glu Ala Phe Ser Asn Gly Val 
11890 11895 11900 

Pro He Gly Asn Ala Val Ser Asn Ser Gly Ala Phe Val Met Asp Gin 
11905 11910 11915 11920 

Asn Gin Gin Leu Val Ser Ala Gly Val He Gly Glu Leu Val Val Thr 

11925 11930 11935

Gly Asp Gly Leu Ala Arg Gly Tyr Thr Asp Ser Lys Leu Arg Val Asp 

11940 11945 11950 

Arg Phe Ile Tyr Ile Thr Leu Asp Gly Asn Arg Val Arg Ala Tyr Arg
11955 11960 11965

Thr Gly Asp Arg Val Arg His Arg Pro Lys Asp Gly Gin lie Glu Phe 
11970 11975 11980 

Phe Gly Arg Met Asp Gin Gin lie Lys He Arg Gly His Arg He Glu 
11985 11990 11995 12000 

Pro Ala Glu Val Glu Gin Ala Leu Ala Arg Asp Pro Ala He Ser Asp 

12005 12010 12015 

Ser Ala Val He Thr Gin Leu Thr Asp Glu Glu Glu Pro Glu Leu Val 

12020 12025 12030 

Ala Phe Phe Ser Leu Lys Gly Asn Ala Asn Gly Thr Asn Gly Val Asn 
12035 12040 12045

Gly Val Ser Asp Gin Glu Lys He Asp Gly Asp Glu Gin His Ala Leu 
12050 12055 12060 

Leu Met Glu Asn Lys He Arg His Asn Leu Gin Ala Leu Leu Pro Thr 
12065 12070 12075 12080 

Tyr Met He Pro Ser Arg He He His Val Asp Gin Leu Pro Val Asn 

12085 12090 12095 

Ala Asn Gly Lys He Asp Arg Asn Glu Leu Ala Val Arg Ala Gin Ala 

12100 12105 12110 

Thr Pro Arg Thr Ser Ser Val Ser Thr Tyr Val Ala Pro Arg Asn Asp
12115 12120 12125 

He Glu Thr He He Cys Lys Glu Phe Ala Asp He Leu Ser Val Arg 
12130 12135 12140 

Val Gly He Thr Asp Asn Phe Phe Asp Leu Gly Gly His Ser Leu He 
12145 12150 12155 12160 

Ala Thr Lys Leu Ala Ala Arg Leu Ser Arg Arg Leu Asp Thr Arg Val 

12165 12170 12175 

Ser Val Arg Asp Val Phe Asp Thr Pro Val Val Gly Gin Leu Ala Ala 

12180 12185 12190

Ser He Gin Gin Gly Ser Thr Pro His Glu Ala He Pro Ala Leu Ser 
12195 12200 12205 

His Ser Gly Pro Val Gln Gln Ser Phe Ala Gln Gly Arg Leu Trp Phe

12210 12215 12220 

Leu Asp Arg Phe Asn Leu Asn Ala Ala Trp Tyr He Met Pro Phe Gly 
12225 12230 12235 12240 

Val Arg Leu Arg Gly Pro Leu Arg Val Asp Ala Leu Gln Thr Ala Leu

12245 12250 12255 

Arg Ala Leu Glu Glu Arg His Glu Leu Leu Arg Thr Thr Phe Glu Glu 

12260 12265 12270

Gln Asp Gly Val Gly Met Gln Ile Val His Ser Pro Arg Met Arg Asp
12275 12280 12285 

He Cys Val Val Asp He Ser Gly Ala Asn Glu Asp Leu Ala Lys Leu 
12290 12295 12300

Lys Glu Glu Gin Gin Ala Pro Phe Asn Leu Ser Thr Glu Val Ala Trp 
12305 12310 12315 12320

Arg Val Ala Leu Phe Lys Ala Gly Glu Asn His His Ile Leu Ser Ile

12325 12330 12335 

Val Met His His He He Ser Asp Gly Trp Ser Val Asp He Phe Gin 
12340 12345 12350

Gin Glu Leu Ala Gin Phe Tyr Ser Val Ala Val Arg Gly His Asp Pro 
12355 12360 12365 

Leu Ser Gin Val Lys Pro Leu Pro lie His Tyr Arg Asp Phe Ala Val 
12370 12375 12380

Trp Gin Arg Gin Asp Lys Gin Val Ala Val His Glu Ser Gin Leu Gin 
12385 12390 12395 12400 

Tyr Trp He Glu Gin Leu Ala Asp Ser Thr Pro Ala Glu He Leu Ser 
12405 12410 12415

Asp Phe Asn Arg Pro Glu Val Leu Ser Gly Glu Ala Gly Thr Val Pro 

12420 12425 12430 

He Val He Glu Asp Glu Val Tyr Glu Lys Leu Ser Leu Phe Cys Arg 
12435 12440 12445

Asn His Gin Val Thr Ser Phe Val Val Leu Leu Ala Ala Phe Arg Val 
12450 12455 12460 

Ala His Tyr Arg Leu Thr Gly Ala Glu Asp Ala Thr He Gly Thr Pro 
12465 12470 12475 12480

He Ala Asn Arg Asn Arg Pro Glu Leu Glu Asp Leu He Gly Phe Phe 

12485 12490 12495

Val Asn Thr Gln Cys Met Arg Ile Ala Leu Glu Glu His Asp Asn Phe

12500 12505 12510 

Leu Ser Val Val Arg Arg Val Arg Ser Thr Ala Ala Ser Ala Phe Glu 
12515 12520 12525 

Asn Gln Asp Val Pro Phe Glu Arg Leu Val Ser Ala Leu Leu Pro Gly
12530 12535 12540 

Ser Arg Asp Ala Ser Arg Asn Pro Leu Val Gin Leu Met Phe Val Val 
12545 12550 12555 12560

His Ser Gin Arg Asn Leu Gly Lys Leu Gin Leu Glu Gly Leu Glu Gly 

12565 12570 12575 

Glu Pro Thr Pro Tyr Thr Ala Thr Thr Arg Phe Asp Val Glu Phe His 

12580 12585 12590 

Leu Phe Glu Gin Asp Lys Gly Leu Ala Gly Asn Val Val Phe Ala Ala 
12595 12600 12605 

Asp Leu Phe Glu Ala Ala Thr He Arg Ser Val Val Glu Val Phe His 
12610 12615 12620

Glu He Leu Arg Arg Gly Leu Asp Gin Pro Asp He Ala He Ser Thr 
12625 12630 12635 12640 

Met Pro Leu Val Asp Gly Leu Ala Ala Leu Asn Ser Arg Asn Leu Pro 

12645 12650 12655 

Ala Val Glu Asp He Glu Pro Asp Phe Ala Thr Glu Ala Ser Val Val 

12660 12665 12670

Asp Val Phe Gln Thr Gln Val Val Ala Asn Pro Asp Ala Leu Ala Val
12675 12680 12685 

Thr Asp Thr Ser Thr Lys Leu Thr Tyr Ala Glu Leu Asp Gin Gin Ser 
12690 12695 12700 

Asp His Val Ala Ala Trp Leu Ser Lys Gin Lys Leu Pro Ala Glu Ser 
12705 12710 12715 12720 

lie Val Val Val Leu Ala Pro Arg Ser Ser Glu Thr lie Val Ala Cys 

12725 12730 12735 

lie Gly lie Leu Lys Ala Asn Leu Ala Tyr Leu Pro Met Asp Ser Asn 

12740 12745 12750 

Val Pro Glu Ala Arg Arg Gin Ala lie Leu Ser Glu lie Pro Gly Glu 
12755 12760 12765 

Lys Phe Val Leu Leu Gly Ala Gly Val Pro lie Pro Asp Asn Lys Thr 
12770 12775 12780 

Ala Asp Val Arg Met Val Phe lie Ser Asp lie Val Ala Ser Lys Thr 
12785 12790 12795 12800 

Asp Lys Ser Tyr Ser Pro Gly Thr Arg Pro Ser Ala Ser Ser Leu Ala 

12805 12810 12815 

Tyr Val He Phe Thr Ser Gly Ser Thr Gly Arg Pro Lys Gly Val Met 

12820 12825 12830 

Val Glu His Arg Gly Val He Ser Leu Val Lys Gin Asn Ala Ser Arg 
12835 12840 12845 

lie Pro Gin Ser Leu Arg Met Ala His Val Ser Asn Leu Ala Phe Asp 
12850 12855 12860 

Ala Ser Val Trp Glu Ile Phe Thr Thr Leu Leu Asn Gly Gly Thr Leu
12865 12870 12875 12880 

Phe Cys He Ser Tyr Phe Thr Val Leu Asp Ser Lys Ala Leu Ser Ala 

12885 12890 12895 

Ala Phe Ser Asp His Arg He Asn He Thr Leu Leu Pro Pro Ala Leu 

12900 12905 12910 

Leu Lys Gin Cys Leu Ala Asp Ala Pro Ser Val Leu Ser Ser Leu Glu 
12915 12920 12925 

Ser Leu Tyr lie Gly Gly Asp Arg Leu Asp Gly Ala Asp Ala Thr Lys 
12930 12935 12940 

Val Lys Asp Leu Val Lys Gly Lys Ala Tyr Asn Ala Tyr Gly Pro Thr 
12945 12950 12955 12960 

Glu Asn Ser Val Met Ser Thr He Tyr Thr lie Glu His Glu Thr Phe 

12965 12970 12975 

Ala Asn Gly Val Pro lie Gly Thr Ser Leu Gly Pro Lys Ser Lys Ala 

12980 12985 12990

Tyr lie Met Asp Gin Asp Gin Gin Leu Val Pro Ala Gly Val Met Gly 
12995 13000 13005 

Glu Leu Val Val Ala Gly Asp Gly Leu Ala Arg Gly Tyr Thr Asp Pro 
13010 13015 13020 

Ser Leu Asn Thr Gly Arg Phe lie His He Thr lie Asp Gly Lys Gin 






13025 13030 13035 13040 

Val Gin Ala Tyr Arg Thr Gly Asp Arg Val Arg Tyr Arg Pro Arg Asp 

13045 13050 13055 

Tyr Gin lie Glu Phe Phe Gly Arg Leu Asp Gin Gin He Lys He Arg 

13060 13065 13070 

Gly His Arg He Glu Pro Ala Glu Val Glu Gin Ala Leu Leu Ser Asp 
13075 13080 13085 

Ser Ser Ile Asn Asp Ala Val Val Val Ser Ala Gln Asn Lys Glu Gly
13090 13095 13100 

Leu Glu Met Val Gly Tyr He Thr Thr Gin Ala Ala Gin Ser Val Asp 
13105 13110 13115 13120 

Lys Glu Glu Ala Ser Asn Lys Val Gin Glu Trp Glu Ala His Phe Asp 

13125 13130 13135 

Ser Thr Ala Tyr Ala Asn Ile Gly Gly Ile Asp Arg Asp Ala Leu Gly

13140 13145 13150 

Gin Asp Phe Leu Ser Trp Thr Ser Met Tyr Asp Gly Ser Leu He Pro 
13155 13160 13165

Arg Glu Glu Met Gin Glu Trp Leu Asn Asp Thr Met Arg Ser Leu Leu 
13170 13175 13180 

Asp Asn Gln Pro Pro Gly Lys Val Leu Glu Ile Gly Thr Gly Thr Gly
13185 13190 13195 13200 

Met Val Leu Phe Asn Leu Gly Lys Val Glu Gly Leu Gin Ser Tyr Ala 

13205 13210 13215 

Gly Leu Glu Pro Ser Arg Ser Val Thr Ala Trp Val Asn Lys Ala He 

13220 13225 13230 

Glu Thr Phe Pro Ser Leu Ala Gly Ser Ala Arg Val His Val Gly Thr
13235 13240 13245 

Ala Glu Asp He Ser Ser He Asp Gly Leu Arg Ser Asp Leu Val Val 
13250 13255 13260 

He Asn Ser Val Ala Gin Tyr Phe Pro Ser Arg Glu Tyr Leu Ala Glu 
13265 13270 13275 13280

Leu Thr Ala Asn Leu Ile Arg Leu Pro Gly Val Lys Arg Ile Phe Phe

13285 13290 13295 

Gly Asp Met Arg Thr Tyr Ala Thr Asn Lys Asp Phe Leu Val Ala Arg

13300 13305 13310 

Ala Val His Thr Leu Gly Ser Asn Ala Ser Lys Ala Met Val Arg Gln

13315 13320 13325 

Gin Val Ala Lys Leu Glu Asp Asp Glu Glu Glu Leu Leu Val Asp Pro 
13330 13335 13340 

[illegible] Phe Thr Ser Leu Ser Gln Phe Pro Asp Glu Ile Lys His

13345 13350 13355 13360

Val Glu He Leu Pro Lys Arg Met Ala Ala Thr Asn Glu Leu Ser Ser 

13365 13370 13375 

Tyr Arg Tyr Ala Ala Val He His Val Gly Gly His Gin Met Pro Asn 

13380 13385 13390

Gly Glu Asp Glu Asp Lys Gln Trp Ala Val Lys Asp Ile Asn Pro Lys
13395 13400 13405

Ala Trp Val Asp Phe Ala Gly Thr Arg Met Asp Arg Gin Ala Leu Leu 
13410 13415 13420 

Gin Leu Leu Gin Asp Arg Gin Arg Gly Asp Asp Val Val Ala Val Ser 
13425 13430 13435 13440 

Asn lie Pro Tyr Ser Lys Thr He Met Glu Arg His Leu Ser Gin Ser 

13445 13450 13455 

Leu Asp Asp Asp Glu Asp Gly Thr Ser Ala Val Asp Gly Thr Ala Trp 

13460 13465 13470 

He Ser Arg Thr Gin Ser Arg Ala Lys Glu Cys Pro Ala Leu Ser Val 
13475 13480 13485 

Ala Asp Leu He Glu He Gly Lys Gly lie Gly Phe Glu Val Glu Ala 
13490 13495 13500 

Ser Trp Ala Arg Gin His Ser Gin Arg Gly Gly Leu Asp Ala Val Phe 
13505 13510 13515 13520

His Arg Phe Glu Pro Pro Arg His Ser Gly His Val Met Phe Arg Phe 

13525 13530 13535 

Pro Thr Glu His Lys Gly Arg Ser Ser Ser Ser Leu Thr Asn Arg Pro 

13540 13545 13550 

Leu His Leu Leu Gln Ser Arg Arg Leu Glu Ala Lys Val Arg Glu Arg
13555 13560 13565 

Leu Gin Ser Leu Leu Pro Pro Tyr Met He Pro Ser Arg He Thr Leu 
13570 13575 13580 

Leu Asp Gln Met Pro Leu Thr Ser Asn Gly Lys Val Asp Arg Lys Lys
13585 13590 13595 13600 

Leu Ala Arg Gin Ala Arg Val He Pro Arg Ser Ala Ala Ser Thr Leu 

13605 13610 13615 

Asp Phe Val Ala Pro Arg Thr Glu Ile Glu Val Val Leu Cys Glu Glu

13620 13625 13630 

Phe Thr Asp Leu Leu Gly Val Lys Val Gly He Thr Asp Asn Phe Phe 
13635 13640 13645 

Glu Leu Gly Gly His Ser Leu Leu Ala Thr Lys Leu Ser Ala Arg Leu 
13650 13655 13660 

Ser Arg Arg Leu Asp Ala Gly He Thr Val Lys Gin Val Phe Asp Gin 
13665 13670 13675 13680 

Pro Val Leu Ala Asp Leu Ala Ala Ser He Leu Gin Gly Ser Ser Arg 

13685 13690 13695 

His Arg Ser He Pro Ser Leu Pro Tyr Glu Gly Pro Val Glu Gin Ser 

13700 13705 13710

Phe Ala Gin Gly Arg Leu Trp Phe Leu Asp Gin Phe Asn He Asp Ala 
13715 13720 13725 

Leu Trp Tyr Leu He Pro Phe Ala Leu Arg Met Arg Gly Pro Leu Gin 
13730 13735 13740

Val Asp Ala Leu Ala Ala Ala Leu Val Ala Leu Glu Glu Arg His Glu
13745 13750 13755 13760 

Ser Leu Arg Thr Thr Phe Glu Glu Arg Asp Gly Val Gly lie Gin Val 

13765 13770 13775 

Val Gin Pro Leu Arg Thr Thr Lys Asp lie Arg lie lie Asp Val Ser 

13780 13785 13790 

Gly Met Arg Asp Asp Asp Ala Tyr Leu Glu Pro Leu Gin Lys Glu Gin 
13795 13800 13805 

Gin Thr Pro Phe Asp Leu Ala Ser Glu Pro Gly Trp Arg Val Ala Leu 
13810 13815 13820 

Leu Lys Leu Gly Lys Asp Asp His He Leu Ser lie Val Met His His 
13825 13830 13835 13840 

He He Ser Asp Gly Trp Ser Thr Glu Val Leu Gin Arg Glu Leu Gly 

13845 13850 13855 

Gin Phe Tyr Leu Ala Ala Lys Ser Gly Lys Ala Pro Leu Ser Gin Val 

13860 13865 13870 

Ala Pro Leu Pro lie Gin Tyr Arg Asp Phe Ala Val Trp Gin Arg Gin 
13875 13880 13885 

Glu Glu Gin Val Ala Glu Ser Gin Arg Gin Leu Asp Tyr Trp Lys Lys 
13890 13895 13900 

Gin Leu Ala Asp Ser Ser Pro Ala Glu Leu Leu Ala Asp Tyr Thr Arg 
13905 13910 13915 13920 

Pro Asn Val Leu Ser Gly Glu Ala Gly Ser Val Ser Phe Val He Asn 

13925 13930 13935 

Asp Ser Val Tyr Lys Ser Leu Val Ser Phe Cys Arg Ser Arg Gin Val 

13940 13945 13950

Thr Thr Phe Thr Thr Leu Leu Ala Ala Phe Arg Ala Ala His Tyr Arg 
13955 13960 13965 

Met Thr Gly Ser Asp Asp Ala Thr He Gly Thr Pro He Ala Asn Arg 
13970 13975 13980 

Asn Arg Pro Glu Leu Glu Asn Leu He Gly Cys Phe Val Asn Thr Gin 
13985 13990 13995 14000 

Cys Met Arg He Thr He Gly Asp Asp Glu Thr Phe Glu Ser Leu Val 

14005 14010 14015 

Gin Gin Val Arg Ser Thr Thr Ala Thr Ala Phe Glu Asn Gin Asp Val 

14020 14025 14030 

Pro Phe Glu Arg He Val Ser Thr Leu Ser Ala Gly Ser Arg Asp Thr 
14035 14040 14045 

Ser Arg Asn Pro Leu Val Gin Leu Leu Phe Ala Val His Ser Gin Gin 
14050 14055 14060 

Gly Leu Gly Arg He Gin Leu Asp Gly Val Val Asp Glu Pro Val Leu 
14065 14070 14075 14080 

Ser Thr Val Ser Thr Arg Phe Asp Leu Glu Phe His Ala Phe Gin Glu 

14085 14090 14095 

Ala Asp Arg Leu Asn Gly Ser Val Met Phe Ala Thr Asp Leu Phe Gln
14100 14105 14110

Pro Glu Thr He Gin Gly Phe Val Ala Val Val Glu Glu Val Leu Gin 
14115 14120 14125 

Arg Gly Leu Glu Gin Pro Gin Ser Pro He Ala Thr Met Pro Leu Ala 
14130 14135 14140 

Glu Gly He Ala Gin Leu Arg Asp Ala Gly Ala Leu Gin Met Pro Lys 
14145 14150 14155 14160 

Ser Asp Tyr Pro Arg Asn Ala Ser Leu Val Asp Val Phe Gin Gin Gin 

14165 14170 14175 

Ala Met Ala Ser Pro Ser Thr Val Ala Val Thr Asp Ser Thr Ser Lys 

14180 14185 14190

Leu Thr Tyr Ala Glu Leu Asp Arg Leu Ser Asp Gin Ala Ala Ser Tyr 
14195 14200 14205 

Leu Arg Arg Gin Gin Leu Pro Ala Glu Thr Met Val Ala Val Leu Ala 
14210 14215 14220 

Pro Arg Ser Cys Glu Thr Ile Ile Ala Phe Leu Ala Ile Leu Lys Ala

14225 14230 14235 14240 

Asn Leu Ala Tyr Met Pro Leu Asp Val Asn Thr Pro Ser Ala Arg Met 

14245 14250 14255 

Glu Ala Ile Ile Ser Ser Val Pro Gly Arg Arg Leu Ile Leu Val Gly

14260 14265 14270 

Ser Gly Val Arg His Ala Asp He Asn Val Pro Asn Ala Lys Thr Met 
14275 14280 14285 

Leu Ile Ser Asp Thr Val Thr Gly Thr Asp Ala Ile Gly Thr Pro Glu

14290 14295 14300

Pro Leu Val Val Arg Pro Ser Ala Thr Ser Leu Ala Tyr Val He Phe 
14305 14310 14315 14320 

Thr Ser Gly Ser Thr Gly Lys Pro Lys Gly Val Met Val Glu His Arg

14325 14330 14335 

Ala He Met Arg Leu Val Lys Asp Ser Asn Val Val Thr His Met Pro 

14340 14345 14350 

Pro Ala Thr Arg Met Ala His Val Thr Asn He Ala Phe Asp Val Ser 
14355 14360 14365 

Leu Phe Glu Met Cys Ala Thr Leu Leu Asn Gly Gly Thr Leu Val Cys 
14370 14375 14380 

He Asp Tyr Leu Thr Leu Leu Asp Ser Thr Met Leu Arg Glu Thr Phe 
14385 14390 14395 14400

Glu Arg Glu Gin Val Arg Ala Ala He Phe Pro Pro Ala Leu Leu Arg 

14405 14410 14415 

Gin Cys Leu Val Asn Met Pro Asp Ala He Gly Met Leu Glu Ala Val 
14420 14425 14430

Tyr Val Ala Gly Asp Arg Phe His Ser Arg Asp Ala Arg Ala Thr Gin 
14435 14440 14445 

Ala Leu Ala Gly Pro Arg Val Tyr Asn Ala Tyr Gly Pro Thr Glu Asn 
14450 14455 14460

Ala Ile Leu Ser Thr Ile Tyr Asn Ile Asp Lys His Asp Pro Tyr Val
14465 14470 14475 14480 

Asn Gly Val Pro lie Gly Ser Ala Val Ser Asn Ser Gly Ala Tyr Val 

14485 14490 14495 

Met Asp Arg Asn Gin Gin Leu Leu Pro Pro Gly Val Met Gly Glu Leu 

14500 14505 14510 

Val Val Thr Gly Glu Gly Val Ala Arg Gly Tyr Thr Asp Ala Ser Leu 
14515 14520 14525 

Asp Thr Asp Arg Phe Val Thr Val Thr lie Asp Gly Gin Arg Gin Arg 
14530 14535 14540

Ala Tyr Arg Thr Gly Asp Arg Val Arg Tyr Arg Pro Lys Gly Phe Gin 
14545 14550 14555 14560 

lie Glu Phe Phe Gly Arg Leu Asp Gin Gin Ala Lys lie Arg Gly His 

14565 14570 14575 

Arg Val Glu Leu Gly Glu Val Glu His Ala Leu Leu Ser Glu Asn Ser 

14580 14585 14590 

Val Thr Asp Ala Ala Val Val Leu Arg Thr Met Glu Glu Glu Asp Pro 
14595 14600 14605 

Gin Leu Val Ala Phe Val Thr Thr Asp His Glu Tyr Arg Ser Gly Ser 
14610 14615 14620 

Ser Asn Glu Glu Glu Asp Pro Tyr Ala Thr Gin Ala Ala Gly Asp Met 
14625 14630 14635 14640 

Arg Lys Arg Leu Arg Ser Leu Leu Pro Tyr Tyr Met Val Pro Ser Arg 

14645 14650 14655 

Val Thr lie Leu Arg Gin Met Pro Leu Asn Ala Asn Gly Lys Val Asp 

14660 14665 14670 

Arg Lys Asp Leu Ala Arg Arg Ala Gin Met Thr Pro Thr Ala Ser Ser 
14675 14680 14685 

Ser Gly Pro Val His Val Ala Pro Arg Asn Glu Thr Glu Ala Ala lie 
14690 14695 14700 

Cys Asp Glu Phe Glu Thr lie Leu Gly Val Lys Val Gly lie Thr Asp 
14705 14710 14715 14720 

Asn Phe Phe Glu Leu Gly Gly His Ser Leu Leu Ala Thr Lys Leu Ala 

14725 14730 14735 

Ala Arg Leu Ser Arg Arg Met Gly Leu Arg lie Ser Val Lys Asp Leu 

14740 14745 14750 

Phe Asp Asp Pro Val Pro Val Ser Leu Ala Gly Lys Leu Glu Gln Gln
14755 14760 14765 

Gin Gly Phe Ser Gly Glu Asp Glu Ser Ser Thr Val Gly lie Val Pro 
14770 14775 14780 

Phe Gin Leu Leu Pro Ala Glu Met Ser Arg Glu lie lie Gin Arg Asp 
14785 14790 14795 14800 

Val Val Pro Gin lie Glu Asn Gly His Ser Thr Pro Leu Asp Met Tyr 

14805 14810 14815

Pro Ala Thr Gln Thr Gln Ile Phe Phe Leu His Asp Lys Ala Thr Gly

14820 14825 14830 

His Pro Ala Thr Pro Pro Leu Phe Ser Leu Asp Phe Pro Glu Thr Ala 
14835 14840 14845 

Asp Cys Arg Arg Leu Ala Ser Ala Cys Ala Ala Leu Val Gin His Phe 
14850 14855 14860 

Asp He Phe Arg Thr Val Phe Val Ser Arg Gly Gly Arg Phe Tyr Gin 
14865 14870 14875 14880 

Val Val Leu Ala His Leu Asp Val Pro Val Glu Val He Glu Thr Glu 

14885 14890 14895 

Gin Glu Leu Asp Glu Val Ala Leu Ala Leu His Glu Ala Asp Lys Gin 

14900 14905 14910 

Gin Pro Leu Arg Leu Gly Arg Ala Met Leu Arg He Ala He Leu Lys 
14915 14920 14925 

Arg Pro Gly Ala Lys Met Arg Leu Val Leu Arg Met Ser His Ser Leu 
14930 14935 14940 

Tyr Asp Gly Leu Ser Leu Glu His He Val Asn Ala Leu His Ala Leu 
14945 14950 14955 14960 

Tyr Ser Asp Lys His Leu Ala Gin Ala Pro Lys Phe Gly Leu Tyr Met 

14965 14970 14975 

His His Met Ala Ser Arg Arg Ala Glu Gly Tyr Asn Phe Trp Arg Ser 

14980 14985 14990 

He Leu Gin Gly Ser Ser Met Thr Ser Leu Lys Arg Ser Val Gly Ala 
14995 15000 15005 

Leu Glu Ala Met Thr Pro Ser Ala Gly Thr Trp Gin Thr Ser Lys Ser 
15010 15015 15020 

He Arg He Pro Pro Ala Ala Leu Lys Asn Gly He Thr Gin Ala Thr 
15025 15030 15035 15040 

Leu Phe Thr Ala Ala Val Ser Leu Leu Leu Ala Lys His Thr Lys Ser 

15045 15050 15055 

Thr Asp Val Val Phe Gly Arg Val Val Ser Gly Arg Gin Asp Leu Ser 

15060 15065 15070 

lie Asn Cys Gin Asp He Val Gly Pro Cys He Asn Glu Val Pro Val 
15075 15080 15085 

Arg Val Arg He Asp Glu Gly Asp Asp Met Gly Gly Leu Leu Arg Ala 
15090 15095 15100 

He Gin Asp Gin Tyr Thr Ser Ser Phe Arg His Glu Thr Leu Gly Leu 
15105 15110 15115 15120 

Gin Glu Val Lys Glu Asn Cys Thr Asp Trp Thr Asp Ala Thr Lys Glu 

15125 15130 15135 

Phe Ser Cys Cys He Ala Phe Gin Asn Leu Asn Leu His Pro Glu Ala 

15140 15145 15150 

Glu He Glu Gly Gin Gin He Arg Leu Glu Gly Leu Pro Ala Lys Asp 
15155 15160 15165 

Gln Ala Arg Gln Ala Asn Gly His Ala Pro Asn Gly Thr Asn Gly Thr
15170 15175 15180

Asn Gly Thr Asn Gly Thr Asn Gly Ala Asn Gly Thr Asn Gly Thr Asn 
15185 15190 15195 15200 

Gly Thr Asn Gly Thr His Ala Asn Gly He Asn Gly Ser Asn Gly Val 

15205 15210 15215 

Asn Gly Arg Asp Ser Asn Val Val Ser Ala Ala Gly Asp Gin Ala Pro 
15220 15225 15230

Val His Asp Leu Asp He Val Gly He Pro Glu Pro Asp Gly Ser Val 
15235 15240 15245 

Lys He Gly He Gly Ala Ser Arg Gin He Leu Gly Glu Lys Val Val 
15250 15255 15260

Gly Ser Met Leu Asn Glu Leu Cys Glu Thr Met Leu Ala Leu Ser Arg
15265 15270 15275 15280 

Thr 

(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 178 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO

(iii) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Tolypocladium geodes 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 
ATGCAACTAT CGGCTCTCCA ATTGCGAACA GAAATCGAGC AGAGCTTGAG GGCCTTATTG 60 
GCTGTTTTGT GAATACTCAG TGTATGAGAC TGCCAGTTAC CGATGAAGAT ACATTCGCCA 120 
ATTTGATTGA CTGTGTACGA GAGACGTCAA CCGAGGCCTT GAGCACCAAG ATATCCTT 178 
(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1713 base pairs 

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO 

(iii) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Neocosmospora vasinfecta

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:



ACATCGGGGG TATTGATCGC GATGCCCTCG GACAGGACTT CTTATCCTGG ACATCCATGT 60
ACGACGGCTC ATTGATTCCC CGGGAAGAGA TGCAGGAATG GCTCAGCGAC ACTATGCACT 120
CACTCCTCGA CAACCAGCCA CCCGGAAGAG TGCTCGAGAT CGGAACTGGT ACCGGTATGG 180
TGCTTTTCAA TCTCGGCAAG GTTGAGGGAC TACAGAGCTA TGCCGGTCTT GAGCCCTCGC 240
GCTCCGTCAC TGCCTGGGTT AACAAGGCAA TCGAAACTTT CCCAAGCCTG GCAGGAAGCG 300
CCCGAGTCCA CGTTGGAACC GCCGAGGATG TCAGCTCCAT CAATGGACTG CGTGCCGATC 360
TCGTTGTGAT CAACTCGGTC GCCCAATACT TCCCAAGTCG AGAATATCTC GCTGAGCTGA 420
CGGCCAACTT GATTCGACTG CCCGGCGTCA AGCGTATTTT CTTCGGCGAC ATGAGAACCT 480
ATGCCACCAA TAAGGACTTC TTGGTGGCAC GAGCAGTCCA TACCCTAGGG TCCAATGCAT 540
CTAAGGCCAT GGTTCGACAA CAGGTGGCCA AGCTTGAAGA TGACGAGGAA GAGTTGCTTG 600
TTGACCCTGC CTTCTTCACC AGCCTGAGCG ACCAGTTCCC TGACGAAATC AAGCACGTCG 660
AGATTCTGCC AAAGAGGATG GCCGCGACCA ACGAACTCAG CTCTTACCGA TATGCTGCTG 720
TTATTCATGT GGGAGGCCAC GAGATGCCGA ATGGGGAGGA TGAGGATAAG CAATGGGCTG 780
TCAAGGATAT CGATCCGAAG GCCTGGGTGG ACTTCGCCGG CACGAGGATG GACCGTCAGG 840
CTCTCTTGCA GCTCCTCCAG GACCGCCAAC GTGGCGATGA CGTTGTTGCC GTCAGTAACA 900
TCCCATACAG CAAGACCATC ATGGAGCGCC ATCTGTCTCA GTCACTTGAC GATGACGAGG 960
ACGGCACTTC AGATGCAGAC GGAACGGCCT GGATATCGGC CACTCAATCA CGGGCGAAGG 1020
AATGCCCTGC TCTCTCAGTG GCCGACCTGA TTGAGATTGG TAAGGGGATC GGCTTCCAAG 1080
TTGAGACCAG CTGGGCTCGA CAACACTCCC AGCGCGGCGG ACTCGATGCT GTTTTCCACC 1140
GATTCGAAAA ACCAAGACAC TCGGGTCATG TCATGTTCAG GTTCCCAACT GAACACAAGG 1200
GGCCGGTCTT CGAGCAGTCT CACGAATCGC CCGCTACACC TGGTTCAGAG CCGCCGGCTG 1260
GAGGCAAAGG TCCGCGAGCG GCTGCAATCG CTGCTTCCAT CGTACATGAT TCCCTCTCGG 1320
ATCATGTTGC TCGATCAGAT GCCTCTCACG TCCAACGGCA AGGTGGATCG CAAGAAGCTC 1380
GCTCGACAAG CCCGGGTCAT CCCAACAATT GCCGCAAGCA CGTTGGACTT TGTGGCGCGC 1440
ACGCACGGAA ATCGAGGTCG GTTCTCTGCG AAGAATTTAC CGATCTACTA GGCGTCAAGG 1500
TCGGCATTAC AGACAACTTC TTCGAGTTGG GCGGCCATTC GCTGCTGGCC ACGAAACTGA 1560
GCGCACGTCT AAGTCGCAGA CTGGACGCCG GTGTCACTGT GAAGCAGATC TTTGACCAGC 1620
CAGTACTTGC TGATCTTGCT GCTTCTATTC GTCAAGGCTC GTCCCGTCAC AGGTCTATCC 1680
CGTCTTTACC CTACGAAGGA CCCGTGGAGC AGT 1713


(2) INFORMATION FOR SEQ ID NO: 5: 











(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 655 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iii) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Tolypocladium niveum 

(B) STRAIN: ATCC 34921 



<xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

CATCAGCAAT CATGGGCAAC AAAGTCTTCT TCGACATTGA GTGGGAGGGC CCCGTCATGC 60

AGGGTTGCAA GCCTACCTCT ACCGTCAAAG AGCAGTCTGG TCGCATCAAC TTCAAGCTGT 120

ACGATGACGT CGTCCCCAAG ACCGCCGAGA ACTTCCGCGC TCTCTGCACC GGCGAGAAGG 180

GCTTCGGCTA CGAGGGCTCG TCCTTCCACC GTATCATCCC CGAGTTCATG CTCCAGGGCG 240

GCGACTTCAC CCGCGGTAAC GGCACTGGCG GCAAGTCCAT CTACGGCGAG AAGTTTGCCG 300

ATGAGAACTT CCAGCTGAAG CACGACCGCC CCGGTCTGCT GTCCATGGCT AACGCTGGCC 360

CCAACACCAA CGGCTCCCAG TTCTTCGTCA CCACCGTCGT CACCTCGTGG CTCAACGGCC 420

ACCACGTCGT CTTCGGCGAG GTCGCTGACC AGGAGTCCCT GGACGTCGTC AAGGCCCTTG 480

AGGCCACTGG CTCTGGTAGC GGCGCTGTCA AGTACAACAA GCGCGCCACC ATTGTCAAGT 540

CTGGCGAGCT GTAAGCTATG GCATCTGTGT ATCTTGCGAT TTCCTGCACC CAATTCGGAC 600

GGACAAAAGA GGCGCTGCCC ACAGCAAGGA CCTTTGGTTC ACGGGACGGC TTGAA 655



(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iii) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 
GGGATATCGT GAATTGTAAT ACGACTCACT ATA 33

(2) INFORMATION FOR SEQ ID NO: 7:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2157 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 

(iii) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

GGATCCGTGA ATTGTAATAC GACTCACTAT AGGGCGAATT CGCTCGACGT CACCTAGGAG 60 

ATCAGCCAGC TCCTTGGCCC TGTTCCGCAC GTTGATGCCC TGGTCTTTGC CGTTTGGATC 120

GATGAAGTGG AACTGGCGCA GCATCTTCAA AAGTGTGATG TGTCCCCGAG CGTCATCAAT 180 

CACACGCTCA GAGCCATGCT TGACGAGGAA CTCGAGCAGT TGCAGAGCCT TGTAGATCTG 240 

GCGCCACTCC TCGGCCGACT TCTCCGTGAA CCGTCGATAT ATCATCGGCA TGATCTCGTT 300 

GAGGGTTTGG CTGGTTCTGT TAGCTGAAGC CGGGCTGTTC AGTCGTCGAA CCGCGTACTA 360 

GTTGAAGGTG CCATTGGCAA TCTCCTGCAT AATACTGGAC GATGCTCCCC ATGGCTCGTT 420 

GTTCGTTGCC TCTCGGACCT AGTACACGGA GTTAGCCACC GTGTTAACAA ACCGTCGCGG 480

CCGCAGACTA ACCTTGGACT CCATCTCGGT ATAGTTCATA ACAGCTACAT GCCAGGTCAG 540 

CATTGGACGC GCCAGGGCTG AGGTCAGGCC TGGTACCATT TTGCGCCTTT CGGAACCCAG 600 

CCTTGAGGTC GTACAAGGTC AGGTTGGAGA CTGTGTTCTT GATGTCGTTC AAGTCCATTT 660 

TGGCAGATTC GACTTAGCGA GACCGGCCGG GAGCGGCAGA GGAGTTGTCG ATTCAGCACG 720

AGTCGCTGAT GAGCGATGGT TGTGGTGCAA GTCGATGGTC CGAGGGCGGG TGGTAGAGGT 780 

GCTTGTCGCG ATGGACAGCT GGACTTTCGG GCCGCCAGCG ACACCTACCC GGCCTTGATG 840 

GGTCAGAGGG ATGATCACGT GATATGGGTC GGAGTCGCAT CGTACTTCGT ACCAGCATCA 900 

TCTCCAAGCC AGAGGCAGCA GAGATTATAT GACTGCAAAT GTGAAACGAA ATAAACCGTC 960 

AATATGGTAT TTATGTTGGC AATTGCATGA TGCATCCCGG TGGAATTGAA CTAGAACGTC 1020

GAGGGCTTGC ATACCAGAGG CTGCGGGTGC ATCGTGGGCA GCGGTACCTG AGACTTCAGG 1080 

CCAGAACGAC TGCTAATAAG CCGCGACGGA GCCAAAACTT TTCCCCTTTC CAGAGGCTCT 1140 

CAGCTTTCGA CTCAGCCATT TGAACTTGCG ACTCAAGCCC GTTCATAACA CTTCATCTCT 1200 

TGTACTTCTA CCGCATTACC TCCTGTACGA ATTGTAATCC CAGGTATGTC TATTTTCCTG 1260 

TTGTTCTCGT CACATGCCCT CCCCAGCATG CGCAATGTCT TTGGACAACG CAGCTCCTCT 1320

CGACACATCA CAAAGGCTTC ACCCAGCAGA GCACGCGAGA GCCTGCGCGC GACAGCCTGC 1380 

GAGCGACATG CAGCGCTTCC CTGGAAGCCA ACTGCACCAG CCTGGAAAGT TGCGCAGTTT 1440 

GCCAGGGGGC CTCCGTCCCC CAGAATGGAT GGCACTCCTC GGCTTGACCT GGAGCGCTGC 1500

TCCCGATCAA GCCAGAGCCC GCCGGCGATG GGGACTGGCC GCGCCAGCCT CTGCACATGA 1560

GTGTGCTGGT TGGCTGGAGG TGGGTGGCCT TTGGCCTCCC AACCAGTCCC CACCATTTGC 1620 

TGGAAGCTGC TGCAGCTGGT CGGAACGCAC CCAAGCCGTT GAGCTCAGCG CTCTGTCGGG 1680 

TCGAGCGCCC ATTGGGGTTC CCGCGAAGGT CCTTTGACTG GGCCGGGGCC ACTCGTCTTG 1740 

CCGGCCAGAG CTGAGCTCGC TGGTCTGGCA GCGACAGCAG CCGGGAGCTC CGTTGTCTAG 1800 

GCGATGAGCG CAGCGGCCAG AGCTCCGGGC CGGATCGGTG ACCTCACAGC CGTGGAAGCT 1860

CCTGGGCCCC CGAATCAAGG ACCGCAATTC CACGTGACTG GCCGGTTGCT CCCCTTCCGG 1920

CATTGCCCGC CCCGCTATTA CACCCCTTTG CGCGCCCTGG TTGGTTCAAA GTCCCACCGC 1980

TAACTTTTAA CCCCTCCAGC AGCCTTCAAA ATGAAGTCAA CGCTCCTTCG ACCCCTCCTA 2040

CCCCGCTATA AGCTCTGCTC CCCCGGGTCA AGATCTTTCC CTCTTCCACA ACTTGCATCA 2100

GCTTCCAACA CATTCCGAGC TGCTCGATTC TTCTCCGCAA CATCAGCAAT CATCGAT 2157



Claims 

1. An isolated DNA sequence which codes for an enzyme having cyclosporin synthetase-like activity. 

2. A DNA sequence according to claim 1 which codes for cyclosporin synthetase or an enzyme that is at 
least 70% homologous thereto and that has cyclosporin synthetase-like activity. 

3. A DNA sequence according to claim 1 or claim 2 which codes for an enzyme that has cyclosporin
synthetase-like activity and in which at least one amino-acid recognition unit is different from that of
cyclosporin synthetase.

4. A DNA sequence according to any of claims 1 to 3 which includes the 2890 bp SalI restriction fragment
containing sequences 40239 to 43129 of Seq Id 1, or a sequence which hybridizes thereto.

5. A DNA sequence according to any of claims 1 to 3 which includes the 2482 bp SalI restriction fragment
containing sequences 37781 to 40244 of Seq Id 1, or a sequence which hybridizes thereto.

6. A DNA sequence according to claim 1 which includes the sequence of Seq Id 1, or a sequence that
hybridizes thereto.

7. A DNA sequence according to claim 1 which codes for an enzyme having an amino acid sequence as given 
in Seq Id 2. 

8. A recombinant vector containing a DNA sequence as defined in any one of claims 1 to 7. 

9. A recombinant vector according to claim 8 which has a restriction map as set out in any one of figures 2 
to 5. 

10. A host cell carrying a vector according to claim 8 or claim 9. 

11. A process for the production of cyclosporin or a cyclosporin derivative, comprising cultivating a host cell 
according to claim 10 and causing the host cell to produce the cyclosporin or cyclosporin derivative. 

12. A method for the production of a cyclosporin derivative, comprising altering the DNA sequence coding for
cyclosporin synthetase so that the enzyme causes the production of the cyclosporin derivative, placing 
the altered DNA sequence in a vector, transforming a host cell with the vector, and causing the host cell 
to produce the cyclosporin derivative. 

13. A method according to claim 11 in which the DNA sequence coding for cyclosporin synthetase is altered 
by changing the fragments that code for amino acid recognition units. 

FIGURE 2

FIGURE 3

FIGURE 4



Restriction sites labelled on the map:

7663, ClaI
7582, BglII
7141, PstI
EcoRI, 249
PstI, 339
NcoI, 358
6797, SphI
6574, Asp718
6083, Asp718
5988, NotI
5920, NcoI
5868, SpeI
5745, BglII
5722, XhoI
5618, EcoRI
5146, SalI
4924, XhoI
4749, EcoRI*
4714, BamHI



FIGURE 5 